Evaluating Rater Accuracy in Performance Assessments
Author(s) - George Engelhard
Publication year - 1996
Publication title - Journal of Educational Measurement
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.917
H-Index - 47
eISSN - 1745-3984
pISSN - 0022-0655
DOI - 10.1111/j.1745-3984.1996.tb00479.x
Subject(s) - rasch model , item response theory , psychometrics , benchmarking , statistics , psychology
A new method for evaluating rater accuracy within the context of performance assessments is described. Accuracy is defined as the match between ratings obtained from operational raters and those obtained from an expert panel on a set of benchmark, exemplar, or anchor performances. An extended Rasch measurement model called the FACETS model is presented for examining rater accuracy. The FACETS model is illustrated with 373 benchmark papers rated by 20 operational raters and an expert panel. The data are from the 1993 field test of the High School Graduation Writing Test in Georgia. The data suggest that there are statistically significant differences in rater accuracy; the data also suggest that it is easier to be accurate on some benchmark papers than on others. A small example is presented to illustrate how the accuracy ordering of raters may not be invariant over different subsets of benchmarks used to evaluate accuracy.
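
For readers wanting the model's general shape: the FACETS model is a many-facet extension of the Rasch model, and in an accuracy analysis the data being modeled are agreement scores between each operational rater and the expert panel on each benchmark paper. The parameterization below is an illustrative sketch under the assumption of dichotomous accuracy scores (1 = match with the expert rating, 0 = mismatch); the symbols A_n and D_m are placeholders, and the article's own specification may include additional facets or rating-scale thresholds.

% Sketch of a FACETS-style accuracy model (assumed notation, not quoted from the article)
\[
  \ln\!\left(\frac{P_{nm1}}{P_{nm0}}\right) = A_n - D_m
\]

where P_{nm1} is the probability that operational rater n matches the expert rating on benchmark paper m, P_{nm0} is the probability of a mismatch, A_n is the accuracy of rater n, and D_m is the difficulty of rating benchmark m accurately. On this reading, the reported differences in rater accuracy correspond to spread in the estimated A_n, and the finding that some benchmark papers are easier to rate accurately than others corresponds to spread in D_m.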
