Effect of Unequal Variances in Proficiency Distributions on Type‐I Error of the Mantel‐Haenszel Chi‐square Test for Differential Item Functioning
Author(s) -
Monahan Patrick O.,
Ankenmann Robert D.
Publication year - 2005
Publication title -
Journal of Educational Measurement
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.917
H-Index - 47
eISSN - 1745-3984
pISSN - 0022-0655
DOI - 10.1111/j.1745-3984.2005.00006
Subject(s) - differential item functioning, statistics, item response theory, type I and type II errors, Rasch model, sample size determination, sample (material), variance (accounting), econometrics, psychology, mathematics, psychometrics, chemistry, accounting, chromatography, business
Empirical studies have demonstrated Type‐I error (TIE) inflation of the Mantel‐Haenszel chi‐square test for differential item functioning (DIF), especially for highly discriminating easy items, when data conformed to item response theory (IRT) models more complex than the Rasch model and when IRT proficiency distributions differed only in means. However, no published study had manipulated the proficiency variance ratio (VR). In the present study, data were generated with the three‐parameter logistic (3PL) IRT model, and proficiency VRs were 1, 2, 3, and 4. The results suggest that inflation may be greater, and may affect all highly discriminating items (low, moderate, and high difficulty), when the IRT proficiency distributions of the reference and focal groups also differ in variances. Inflation was greatest for the 21‐item test (vs. 41 items) and a total sample size of 2,000 (vs. 1,000). Previous studies had not systematically examined sample size ratio. A sample size ratio of 1:1 produced greater TIE inflation than 3:1, but primarily for a total sample size of 2,000.
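
To make the design described in the abstract concrete, the following is a minimal Python sketch of one simulation cell. It is not the authors' code. The item parameters, the focal-group mean of -1, the choice to give the larger variance to the focal group, the use of the continuity-corrected Mantel‐Haenszel statistic with matching on total score (studied item included), and all function names are assumptions made for illustration only. Both groups answer under identical 3PL item parameters, so no DIF is present and every rejection counts as a Type‐I error.

```python
import math
import random


def p_3pl(theta, a, b, c):
    """3PL probability of a correct response (logistic metric, no 1.7 scaling constant)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def simulate_group(n, mean, sd, items, rng):
    """Draw n proficiencies from N(mean, sd^2); return one 0/1 response row per examinee."""
    rows = []
    for _ in range(n):
        theta = rng.gauss(mean, sd)
        rows.append([1 if rng.random() < p_3pl(theta, *item) else 0 for item in items])
    return rows


def mh_chi_square(ref, foc, item):
    """Continuity-corrected Mantel-Haenszel chi-square for one item, matched on total score."""
    strata = {}
    for group, rows in ((0, ref), (1, foc)):
        for row in rows:
            # Cell layout per stratum: [ref right, ref wrong, focal right, focal wrong]
            cell = strata.setdefault(sum(row), [0, 0, 0, 0])
            cell[2 * group + (1 - row[item])] += 1
    sum_a = sum_e = sum_v = 0.0
    for a, b, c, d in strata.values():
        t = a + b + c + d
        if t < 2:
            continue  # a single-examinee stratum carries no information
        n_ref, n_foc, m1, m0 = a + b, c + d, a + c, b + d
        sum_a += a                                   # observed reference-group corrects
        sum_e += n_ref * m1 / t                      # expected value under no DIF
        sum_v += n_ref * n_foc * m1 * m0 / (t * t * (t - 1))  # hypergeometric variance
    if sum_v == 0.0:
        return 0.0
    # max(0, .) keeps the continuity correction from creating a spurious nonzero statistic
    return max(0.0, abs(sum_a - sum_e) - 0.5) ** 2 / sum_v


def chi2_p_value_df1(x):
    """Upper-tail p-value of a chi-square variate with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def type_i_error_rate(variance_ratio, n_per_group=500, reps=20, alpha=0.05, seed=1):
    """Share of replications in which the DIF-free studied item is flagged at level alpha."""
    rng = random.Random(seed)
    # Hypothetical (a, b, c) item parameters; NOT the values used in the study.
    items = [(1.0, -2.0 + 0.2 * j, 0.2) for j in range(20)]
    items.append((2.0, -1.0, 0.2))  # studied item: highly discriminating and easy
    studied = len(items) - 1
    rejections = 0
    for _ in range(reps):
        ref = simulate_group(n_per_group, 0.0, 1.0, items, rng)
        # Assumed: focal group has the lower mean and the larger variance.
        foc = simulate_group(n_per_group, -1.0, math.sqrt(variance_ratio), items, rng)
        if chi2_p_value_df1(mh_chi_square(ref, foc, studied)) < alpha:
            rejections += 1
    return rejections / reps
```

Calling `type_i_error_rate(1.0)` and `type_i_error_rate(4.0)` compares two of the variance ratios named in the abstract on a 21‐item test with a 1:1 sample size ratio and a total sample size of 1,000. With the default 20 replications the estimates are very noisy, so a serious comparison would need far more replications and the full set of conditions.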