
Estimating Item Difficulty With Comparative Judgments
Author(s) - Attali Yigal, Saldivia Luis, Jackson Carol, Schuppan Fred, Wanamaker Wilbur
Publication year - 2014
Publication title - ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/ets2.12042
Subject(s) - rank ordering , psychology , estimation , item response theory , paired comparison , statistics , econometrics , psychometrics , mathematics
Abstract - Previous investigations of the ability of content experts and test developers to estimate item difficulty have, for the most part, produced disappointing results. These investigations were based on a noncomparative method of independently rating the difficulty of items. In this article, we argue that, by eliciting comparative judgments of difficulty, judges can estimate item difficulties more accurately. In this study, judges from different backgrounds rank ordered the difficulty of SAT® mathematics items in sets of 7 items. Results showed that judges were reasonably successful in rank ordering several items by difficulty, with little variability across judges and content areas. Simulations of a possible implementation of comparative judgments for difficulty estimation show that it is possible to achieve high correlations between true and estimated difficulties with relatively few comparisons. Implications of these results for the test development process are discussed.
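The report's own simulation design is not reproduced in this record. As a rough illustration of the general idea only, the sketch below simulates judges rank ordering random sets of seven items and aggregates the rankings with a simple mean-rank estimate; the normal difficulty distribution, the judge-noise standard deviation, the set counts, and the mean-rank aggregation are all assumptions for illustration, not the authors' method or parameter values.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n_items=60, set_size=7, n_sets=200, judge_noise_sd=0.7):
    """Simulate judges rank ordering small sets of items by difficulty.

    Returns the correlation between the true (simulated) difficulties and a
    simple mean-rank estimate aggregated over all judged sets.
    """
    true_difficulty = rng.normal(size=n_items)   # latent item difficulties (assumed normal)
    rank_sums = np.zeros(n_items)                # accumulated rank positions per item
    counts = np.zeros(n_items)                   # how many sets each item appeared in

    for _ in range(n_sets):
        items = rng.choice(n_items, size=set_size, replace=False)
        # judge perceives difficulty with noise, then rank orders the set
        perceived = true_difficulty[items] + rng.normal(scale=judge_noise_sd, size=set_size)
        ranks = np.argsort(np.argsort(perceived))  # 0 = judged easiest, set_size-1 = hardest
        rank_sums[items] += ranks
        counts[items] += 1

    judged = counts > 0
    mean_rank = rank_sums[judged] / counts[judged]  # crude difficulty estimate
    return np.corrcoef(true_difficulty[judged], mean_rank)[0, 1]

# More judged sets (i.e., more comparisons) should yield higher correlations
# between true and estimated difficulties.
for n_sets in (50, 200, 800):
    print(n_sets, round(simulate(n_sets=n_sets), 3))
```

A model-based aggregation (e.g., a Thurstone or Bradley-Terry style fit to the implied pairwise comparisons) could replace the mean-rank step; the sketch only shows how estimation accuracy can be examined as a function of the number of comparative judgments collected.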