Monitoring Rater Performance Over Time: A Framework for Detecting Differential Accuracy and Differential Scale Category Use
Author(s) - Myford, Carol M.; Wolfe, Edward W.
Publication year - 2009
Publication title - Journal of Educational Measurement
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.917
H-Index - 47
eISSN - 1745-3984
pISSN - 0022-0655
DOI - 10.1111/j.1745-3984.2009.00088.x
Subject(s) - Rasch model, differential item functioning, rating scale, psychology, differential (mechanical device), scale (ratio), inter rater reliability, statistics, item response theory, psychometrics, clinical psychology, developmental psychology, mathematics, physics, quantum mechanics, engineering, aerospace engineering
In this study, we describe a framework for monitoring rater performance over time. We present several statistical indices to identify raters whose standards drift and explain how to use those indices operationally. To illustrate the use of the framework, we analyzed rating data from the 2002 Advanced Placement English Literature and Composition examination, employing a multifaceted Rasch approach to determine whether raters exhibited evidence of two types of differential rater functioning over time (i.e., changes in levels of accuracy or scale category use). Some raters showed statistically significant changes in their levels of accuracy as the scoring progressed, while other raters displayed evidence of differential scale category use over time.
