Exploring the Impact of Rater Effects on Person Fit in Rater‐Mediated Assessments
Author(s) - Wind, Stefanie A.
Publication year - 2020
Publication title - Educational Measurement: Issues and Practice
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.158
H-Index - 52
eISSN - 1745-3992
pISSN - 0731-1745
DOI - 10.1111/emip.12354
Subject(s) - psychology , inter rater reliability , test (biology) , interpretation (philosophy) , applied psychology , social psychology , reliability (semiconductor) , developmental psychology , computer science , rating scale , power (physics) , physics , quantum mechanics , biology , programming language , paleontology
Researchers have documented the impact of rater effects, or raters’ tendencies to give different ratings than would be expected given examinee achievement levels, in performance assessments. However, the degree to which rater effects influence person fit, or the reasonableness of test‐takers’ achievement estimates given their response patterns, has not been investigated. In rater‐mediated assessments, person fit reflects the reasonableness of rater judgments of individual test‐takers’ achievement over components of the assessment. This study illustrates an approach to visualizing and evaluating person fit in assessments that involve rater judgment using rater‐mediated person response functions (rm‐PRFs). The rm‐PRF approach allows analysts to consider the impact of rater effects on person fit in order to identify individual test‐takers for whom the assessment results may not have a straightforward interpretation. A simulation study is used to evaluate the impact of rater effects on person fit. Results indicate that rater effects can compromise the interpretation and use of performance assessment results for individual test‐takers. Recommendations are presented that call on researchers and practitioners to supplement routine psychometric analyses for performance assessments (e.g., rater reliability checks) with rm‐PRFs to identify students whose ratings may have compromised interpretations as a result of rater effects, person misfit, or both.