
Evaluation of Different Scoring Rules for a Noncognitive Test in Development
Author(s) - Hongwen Guo, Jiyun Zu, Patrick Kyllonen, Neal Schmitt
Publication year - 2016
Publication title - ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/ets2.12089
Subject(s) - item response theory , reliability , classical test theory , Rasch model , psychometrics , equating , statistics , econometrics , psychology
In this report, statistical and psychometric methods are applied systematically to develop and evaluate scoring rules in terms of test reliability. Data collected from a situational judgment test are used for the comparison. For a well‐developed item with appropriate keys (i.e., the correct answers), the various item‐scoring rules are expected to agree in the item‐option characteristic curves. In addition, when item response theory (IRT) models fit the data, test reliability improves substantially, particularly when the nominal response model and its parameter estimates are used in scoring.
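
The following is a minimal sketch, not taken from the report, of how item‐option characteristic curves and option scoring weights can be computed under Bock's nominal response model (NRM). The item parameters, keyed option, and trait grid below are hypothetical illustrations; a real analysis would estimate the parameters from the situational judgment test data with an IRT package.

import numpy as np

def nrm_probabilities(theta, slopes, intercepts):
    """Category probabilities P_k(theta) = exp(a_k*theta + c_k) / sum_j exp(a_j*theta + c_j)."""
    z = np.outer(theta, slopes) + intercepts          # shape: (n_theta, n_options)
    z -= z.max(axis=1, keepdims=True)                 # stabilize the softmax
    p = np.exp(z)
    return p / p.sum(axis=1, keepdims=True)

# Hypothetical parameters for one 4-option item (option 2 is the keyed answer).
slopes     = np.array([-0.8, -0.2, 1.2, -0.2])        # a_k: slope per option
intercepts = np.array([ 0.3,  0.5, 0.1, -0.9])        # c_k: intercept per option

theta = np.linspace(-3, 3, 121)                       # latent trait grid
probs = nrm_probabilities(theta, slopes, intercepts)  # item-option characteristic curves

# Two scoring rules for the same item:
key_scores = np.array([0, 0, 1, 0])                   # dichotomous: 1 for the keyed option only
nrm_scores = slopes                                   # NRM scoring weights are the option slopes a_k

# Expected item score under each rule as a function of theta.
expected_key = probs @ key_scores
expected_nrm = probs @ nrm_scores
print(expected_key[::30])   # expected keyed score at theta = -3, -1.5, 0, 1.5, 3
print(expected_nrm[::30])

Comparing expected_key and expected_nrm across the trait grid illustrates the abstract's point: a dichotomous key uses information from only one option, whereas NRM-based weights draw information from every option, which is one route to the reliability gains described above.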