
Performance of the S-χ² Statistic for the Multidimensional Graded Response Model
Author(s) - Shiyang Su, Chun Wang, David J. Weiss
Publication year - 2020
Publication title - Educational and Psychological Measurement
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.819
H-Index - 95
eISSN - 1552-3888
pISSN - 0013-1644
DOI - 10.1177/0013164420958060
Subject(s) - polytomous Rasch model, statistic, econometrics, context (archaeology), statistics, goodness of fit, item response theory, homogeneous, sample size determination, sample (material), psychology, index (typography), mathematics, computer science, psychometrics, paleontology, combinatorics, world wide web, biology, chemistry, chromatography
S-χ² is a popular item fit index that is available in commercial software packages such as flexMIRT. However, no research has systematically examined the performance of S-χ² for detecting item misfit within the context of the multidimensional graded response model (MGRM). The primary goal of this study was to evaluate the performance of S-χ² under two practical misfit scenarios: first, all items are misfitting due to model misspecification, and second, a small subset of items violate the underlying assumptions of the MGRM. Simulation studies showed that caution should be exercised when reporting item fit results of polytomous items using S-χ² within the context of the MGRM, because of its inflated false positive rates (FPRs), especially with a small sample size and a long test. S-χ² performed well when detecting overall model misfit as well as item misfit for a small subset of items when the ordinality assumption was violated. However, under a number of conditions of model misspecification or items violating the homogeneous discrimination assumption, even though true positive rates (TPRs) of S-χ² were high when a small sample size was coupled with a long test, the inflated FPRs were generally directly related to increasing TPRs. There was also a suggestion that performance of S-χ² was affected by the magnitude of misfit within an item. There was no evidence that FPRs for fitting items were exacerbated by the presence of a small percentage of misfitting items among them.
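The abstract reports results as false positive rates and true positive rates. As a minimal sketch of how such rates are commonly tallied in a simulation study, the Python below pools item-level p-values and counts how often fitting and misfitting items are flagged. The function name `rejection_rates`, the significance level of 0.05, and the p-values are all hypothetical illustrations; none of them come from the article, and the authors' actual procedure may differ.

```python
def rejection_rates(p_values, misfit_flags, alpha=0.05):
    """Return (FPR, TPR) from item-level fit-test p-values.

    p_values     : p-values, one per item (pooled over replications)
    misfit_flags : True where the item was generated to misfit
    alpha        : nominal significance level (assumed, not from the article)
    """
    false_pos = true_pos = n_fit = n_misfit = 0
    for p, is_misfit in zip(p_values, misfit_flags):
        flagged = p < alpha  # item is flagged as misfitting by the test
        if is_misfit:
            n_misfit += 1
            true_pos += flagged
        else:
            n_fit += 1
            false_pos += flagged
    # Guard against conditions with no fitting (or no misfitting) items.
    fpr = false_pos / n_fit if n_fit else float("nan")
    tpr = true_pos / n_misfit if n_misfit else float("nan")
    return fpr, tpr


# Hypothetical p-values for six items; the last three were generated to misfit.
p_values = [0.30, 0.04, 0.62, 0.01, 0.20, 0.03]
misfit_flags = [False, False, False, True, True, True]
fpr, tpr = rejection_rates(p_values, misfit_flags)
print(f"FPR = {fpr:.3f}, TPR = {tpr:.3f}")
```

An FPR well above the nominal level for fitting items is what the abstract calls inflation. In the scenario where all items misfit, there are no fitting items, so only the TPR is defined for that condition.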