
ANALYTIC SCORING OF TOEFL ® CBT ESSAYS: SCORES FROM HUMANS AND E‐RATER ®
Author(s) -
Lee, Yong-Won
Gentile, Claudia
Kantor, Robert
Publication year - 2008
Publication title -
ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/j.2333-8504.2008.tb02087.x
Subject(s) - inter-rater reliability , Test of English as a Foreign Language , rating scale , language assessment , analytic scoring , statistics , psychology , linguistics
The main purpose of the study was to investigate the distinctness and reliability of analytic (or multitrait) rating dimensions and their relationships to holistic scores and e‐rater ® essay feature variables in the context of the TOEFL ® computer‐based test (CBT) writing assessment. The data analyzed in the study were analytic and holistic essay scores provided by human raters and essay feature variable scores computed by e‐rater (version 2.0) for two TOEFL CBT writing prompts. It was found that (a) all six analytic scores were correlated both among themselves and with the holistic scores, (b) the high correlations obtained among holistic and analytic scores were largely attributable to the impact of essay length on both types of scoring, (c) there may be some potential for profile scoring based on analytic scores, and (d) strong associations were confirmed between several e‐rater variables and analytic ratings. Implications are discussed for improving the analytic scoring of essays, validating automated scores, and refining e‐rater essay feature variables.
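The finding that high holistic–analytic correlations are largely attributable to essay length can be illustrated by comparing zero-order correlations with correlations computed after partialing out length. The sketch below is a minimal illustration of that kind of analysis using simulated data and hypothetical variable names; it is not the study's actual data or analysis procedure.

```python
import numpy as np

# Hypothetical data: one row per essay. These simulated scores stand in for
# the study's human ratings and are used only to illustrate the comparison.
rng = np.random.default_rng(0)
n = 200
essay_length = rng.normal(300, 80, n)                    # word count (simulated)
holistic = 0.01 * essay_length + rng.normal(0, 1, n)     # holistic score (simulated)
analytic = 0.01 * essay_length + rng.normal(0, 1, n)     # one analytic dimension (simulated)

def pearson(x, y):
    """Zero-order Pearson correlation between two score vectors."""
    x, y = x - x.mean(), y - y.mean()
    return (x @ y) / np.sqrt((x @ x) * (y @ y))

def partial_corr(x, y, z):
    """Correlation of x and y after removing the linear effect of z
    (here: essay length) from both, via residualization."""
    def residual(v):
        beta = np.polyfit(z, v, 1)
        return v - np.polyval(beta, z)
    return pearson(residual(x), residual(y))

print("zero-order r:", round(pearson(holistic, analytic), 3))
print("partial r (length controlled):", round(partial_corr(holistic, analytic, essay_length), 3))
```

With data like these, the zero-order correlation is inflated by the shared dependence on essay length, and the length-partialed correlation drops noticeably, which is the pattern of attenuation the abstract describes.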