
AN ASSESSMENT OF THE RELATIONSHIP BETWEEN THE ASSUMPTION OF UNIDIMENSIONALITY AND THE QUALITY OF IRT TRUE‐SCORE EQUATING
Author(s) -
Cook Linda L.,
Dorans Neil J.,
Eignor Daniel R.,
Petersen Nancy S.
Publication year - 1985
Publication title -
ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/j.2330-8516.1985.tb00115.x
Subject(s) - equating , item response theory , statistics , test score , econometrics , mathematics , correlation , psychology , psychometrics , Rasch model , standardized test
A strong assumption made by most commonly used item response theory (IRT) models is that the data are unidimensional, i.e., statistical dependence among item scores can be explained by a single ability dimension. One of the major practical applications of item response theory models to testing has been in the area of score equating. This research assesses the relationship between violations of the assumption of unidimensionality and the quality of IRT true‐score equating. First‐order and second‐order factor analyses were conducted on correlation matrices among item parcels. The item parcels were constructed to yield correlation matrices that were amenable to linear factor analyses. The first‐order analyses were employed to assess the effective dimensionality of the item parcel data. Second‐order analyses were employed to test meaningful hypotheses about the structure of the data, hypotheses that were suspected to be pertinent to the quality of equating results. Parcels were constructed for three SAT‐verbal forms and three forms of the Mathematics Level II Achievement test. The quality of IRT true‐score equating was assessed by score scale drift. Scale drift is said to have occurred if the result of equating test form D directly to test form A is not the same as that obtained by equating test form D to test form A through intervening forms B and C. Scale drift was less evident in the Mathematics Level II chain of equatings than it was in the SAT‐verbal chain. The factor analyses uncovered structural similarities and differences across test forms that were consistent with the scale drift results. The dimensionality analyses revealed that the Mathematics Level II item parcels were more nearly unidimensional than the SAT‐verbal item parcels.
In addition, the dimensionality analyses revealed that one SAT‐verbal test form and one Mathematics Level II form were each less parallel to the other two forms in their respective equating chains than these other forms were to each other. Refinements in the dimensionality methodology and a more systematic dimensionality assessment are logical extensions of the present research.
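The scale drift criterion described in the abstract can be illustrated with a minimal Python sketch. It compares a direct conversion from form D to form A with the conversion obtained by chaining D to C, C to B, and B to A. The function names (`chain`, `scale_drift`) and the linear conversions are hypothetical placeholders chosen for illustration; they are not the IRT true‐score equatings or the data used in the report.

```python
from functools import reduce
from typing import Callable, List, Sequence

# A conversion maps a score on one form to the scale of another form.
Conversion = Callable[[float], float]


def chain(conversions: Sequence[Conversion]) -> Conversion:
    """Compose conversions in order: the first one is applied first."""
    return lambda score: reduce(lambda s, f: f(s), conversions, score)


def scale_drift(direct: Conversion, chained: Conversion,
                scores: Sequence[float]) -> List[float]:
    """Return chained-minus-direct differences at each score.

    All differences equal to zero means no drift at the scores examined.
    """
    return [chained(s) - direct(s) for s in scores]


if __name__ == "__main__":
    # Hypothetical linear conversions (assumed values, for illustration only).
    d_to_c = lambda x: 1.02 * x - 0.5
    c_to_b = lambda x: 0.99 * x + 0.8
    b_to_a = lambda x: 1.01 * x - 0.2
    d_to_a = lambda x: 1.02 * x + 0.1  # direct D-to-A conversion

    via_chain = chain([d_to_c, c_to_b, b_to_a])
    for score, diff in zip([10, 30, 50],
                           scale_drift(d_to_a, via_chain, [10, 30, 50])):
        print(f"score {score}: drift {diff:+.3f}")
```

In practice each conversion would come from an equating procedure rather than a fixed linear rule, and the differences would be judged against some tolerance instead of exact equality.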