A Comparison of Item Fit Statistics for Mixed IRT Models
Author(s) -
Kyong Hee Chon,
Won-Chan Lee,
Stephen B. Dunbar
Publication year - 2010
Publication title -
Journal of Educational Measurement
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.917
H-Index - 47
eISSN - 1745-3984
pISSN - 0022-0655
DOI - 10.1111/j.1745-3984.2010.00116.x
Subject(s) - statistics , item response theory , type i and type ii errors , goodness of fit , econometrics , mathematics , sample size determination , psychology , psychometrics , computer science
In this study we examined procedures for assessing model‐data fit of item response theory (IRT) models for mixed‐format data. The model‐fit indices used in this study include PARSCALE's G², Orlando and Thissen's S−X² and S−G², and Stone's χ²* and G²*. To investigate the relative performance of the fit statistics at the item level, we conducted two simulation studies: a Type I error study and a power study. We evaluated the performance of the item‐fit indices under various conditions of test length, sample size, and IRT model. Among the competing measures, the summed‐score‐based indices S−X² and S−G² were found to be a sensible and efficient choice for assessing model fit for mixed‐format data; these indices performed well, particularly with short tests. The pseudo‐observed‐score indices, χ²* and G²*, showed inflated Type I error rates in some simulation conditions. Consistent with the existing literature, PARSCALE's G² index was rarely useful, although it provided reasonable results for long tests.
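The summed‐score‐based idea behind an index such as S−X² can be illustrated with a short sketch. Note this is an illustrative approximation only: it groups examinees by rest score and averages model probabilities within each group, rather than using Orlando and Thissen's exact summed‐score recursion, and all item parameters, sample sizes, and function names below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def p_2pl(theta, a, b):
    """2PL item response function: P(correct | theta)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

# Hypothetical test: 10 dichotomous 2PL items, 2000 simulees
# (parameters are illustrative, not from the article).
n_items, n_persons = 10, 2000
a = rng.uniform(0.8, 1.6, n_items)
b = rng.normal(0.0, 1.0, n_items)
theta = rng.normal(0.0, 1.0, n_persons)

P = p_2pl(theta[:, None], a, b)                  # person x item probabilities
X = (rng.random((n_persons, n_items)) < P).astype(int)   # simulated responses

def item_fit(X, P, item, min_n=50):
    """Pearson-type fit statistic for one item, grouping examinees by
    rest score (summed score excluding the item), in the spirit of
    S-X^2.  Expected proportions are approximated by averaging model
    probabilities within each group; small groups are simply skipped
    here, whereas in practice they would be collapsed."""
    rest = X.sum(axis=1) - X[:, item]
    stat, df = 0.0, 0
    for s in np.unique(rest):
        idx = rest == s
        n_k = int(idx.sum())
        if n_k < min_n:
            continue
        o = X[idx, item].mean()          # observed proportion correct
        e = P[idx, item].mean()          # approximate expected proportion
        stat += n_k * (o - e) ** 2 / (e * (1 - e))
        df += 1
    return stat, df

stat, df = item_fit(X, P, item=0)
print(f"fit statistic = {stat:.2f} on {df} score groups")
```

Under the generating model, the statistic should be modest relative to its degrees of freedom; when the fitted model is wrong (e.g., fitting a 1PL to 2PL data), the observed and expected group proportions diverge and the statistic grows, which is the behavior the Type I error and power studies evaluate.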