Using Patterns of Summed Scores in Paper‐and‐Pencil Tests and Computer‐Adaptive Tests to Detect Misfitting Item Score Patterns
Author(s) - Meijer, Rob R.
Publication year - 2004
Publication title - Journal of Educational Measurement
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.917
H-Index - 47
eISSN - 1745-3984
pISSN - 0022-0655
DOI - 10.1111/j.1745-3984.2004.tb01110.x
Subject(s) - item response theory , statistics , mathematics , nonparametric statistics , parametric statistics , computerized adaptive testing , logistic regression , trait , psychometrics , computer science
Two new methods are proposed for detecting unexpected sum scores on subtests (testlets), both for paper-and-pencil tests and for computer-adaptive tests. A method based on a conservative bound using the hypergeometric distribution, denoted p, was compared with a method in which the probability of each score combination was calculated using a highest density region (HDR). Both methods were also compared with the standardized log-likelihood statistic, with and without a correction for the estimated latent trait value (denoted l*_z and l_z, respectively). Data were simulated under the one-parameter logistic model, and both parametric and nonparametric logistic regression were used to estimate the latent trait. Results showed that it is important to take the trait level into account when comparing subtest scores. In a nonparametric item response theory (IRT) context, an adapted version of the HDR method was a powerful alternative to p. In a parametric IRT context, l*_z had the highest power when the data were simulated conditionally on the estimated latent trait level.
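
The abstract names the standardized log-likelihood statistic l_z without giving its form. As background, the sketch below computes the classical l_z of Drasgow, Levine, and Williams (1985) under the one-parameter logistic (Rasch) model used in the simulations. This is an illustration only, not the paper's implementation (which also covers the corrected variant l*_z and the subtest-based p and HDR methods); the item difficulties, ability value, and response pattern are hypothetical.

# Minimal sketch of the l_z person-fit statistic under the Rasch model.
# Hypothetical inputs; not the paper's code.
import numpy as np

def rasch_probs(theta, difficulties):
    """P(correct) per item under the one-parameter logistic model."""
    return 1.0 / (1.0 + np.exp(-(theta - difficulties)))

def l_z(responses, theta, difficulties):
    """Standardized log-likelihood of a 0/1 item-score pattern.
    Large negative values flag misfitting (unexpected) patterns."""
    p = rasch_probs(theta, difficulties)
    u = np.asarray(responses, dtype=float)
    log_lik = np.sum(u * np.log(p) + (1 - u) * np.log(1 - p))
    expected = np.sum(p * np.log(p) + (1 - p) * np.log(1 - p))
    variance = np.sum(p * (1 - p) * np.log(p / (1 - p)) ** 2)
    return (log_lik - expected) / np.sqrt(variance)

# Hypothetical example: an examinee at theta = 1.0 who misses the two
# easiest items but answers the harder ones correctly; the aberrant
# pattern yields a strongly negative l_z (about -4.6 here).
difficulties = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
responses = [0, 0, 1, 1, 1]
print(l_z(responses, theta=1.0, difficulties=difficulties))

In practice theta is unknown and must be estimated from the same responses, which distorts the null distribution of l_z; that is the motivation for the corrected statistic l*_z compared in the study.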