Item Response Models for Multiple Attempts With Incomplete Data
Author(s) -
Yoav Bergner,
Ikkyu Choi,
Katherine E. Castellano
Publication year - 2019
Publication title -
Journal of Educational Measurement
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.917
H-Index - 47
eISSN - 1745-3984
pISSN - 0022-0655
DOI - 10.1111/jedm.12214
Subject(s) - item response theory , missing data , model selection , logistic regression , psychometrics , econometrics , statistics
Allowance for multiple chances to answer constructed-response questions is a prevalent feature of computer-based homework and exams. We consider the use of item response theory to estimate item characteristics and student ability when multiple attempts are allowed but no explicit penalty is deducted for extra tries. This is common practice in online formative assessments, where the number of attempts is often unlimited. In these environments, some students do not always answer until correct but may instead terminate the response process after one or more incorrect tries. We contrast graded and sequential item response models, both unidimensional models that do not explicitly account for factors other than ability. These approaches differ not only in their log-odds assumptions but, importantly, in how they handle incomplete data. We explore the consequences of model misspecification through a simulation study and with four online homework data sets. Our results suggest that model selection is insensitive when data are complete but quite sensitive to whether missing responses are regarded as informative (of inability) or not (e.g., missing at random). Under realistic conditions, a sequential model with parametric degrees of freedom similar to a graded model's can account for more response patterns and outperforms the latter in terms of model fit.
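To make the contrast in log-odds assumptions concrete, here is a minimal sketch in standard IRT notation; the symbols $\theta$ (ability), $a_j$ (slope), and $b_{jk}$ (step location) are assumed for illustration and are not taken from the article. For a graded item score $X_j$, the graded response model places the logit on cumulative probabilities,

\[
\log \frac{P(X_j \ge k \mid \theta)}{P(X_j < k \mid \theta)} = a_j(\theta - b_{jk}),
\]

whereas the sequential model places it on the conditional probability of clearing step $k$ given that step $k-1$ was reached,

\[
\log \frac{P(X_j \ge k \mid X_j \ge k-1, \theta)}{P(X_j = k-1 \mid X_j \ge k-1, \theta)} = a_j(\theta - b_{jk}).
\]

Both forms use one slope and one location per step, consistent with the abstract's note that the two models have similar parametric degrees of freedom; the step-by-step conditioning in the sequential form is one way to see how it can accommodate response processes that terminate after an incorrect attempt.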