Open Access
REMOTE SCORING OF ESSAYS
Author(s) -
Hunter M. Breland,
Robert J. Jones
Publication year - 1988
Publication title -
ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/j.2330-8516.1988.tb00260.x
Subject(s) - reliability , validity , scoring , correlation , psychometrics , statistics , psychology , mathematics education
Essays written by college freshmen on two different topics were scored first by readers working in a conference setting and second by another set of readers working in their own homes or offices. The conference readers were trained in the standard manner on the specific topics to be scored and were monitored by table leaders, as is done in standard scoring procedures. The remote readers received only written instructions in the mail, and there was no monitoring of their scoring. Analyses of the reliability and validity of the two scoring methods were then conducted.

Reliability comparisons favored the conference method over the remote method (.75 for conference scoring versus an average of .62 for remote scoring of two essays by three readers). Validity was assessed through multiple correlations with two criteria: (1) a four‐essay criterion, with each essay scored by three readers different from those who scored the essays being examined, and (2) freshman English course grade. The conference scores yielded a slightly higher average multiple correlation with the four‐essay criterion than did the remote scores (.67 versus .63), but for the English course grade criterion the average multiple correlation was the same for both types of scoring (.50). Average incremental validity (the contribution of essay score over and beyond the multiple correlation using multiple‐choice scores—TSWE and ECT) was .06 for the conference scoring and .05 for the remote scoring when the four‐essay criterion was used.

The remote scores were also statistically calibrated by adjusting scores for reader tendencies. These calibrations were effective in reducing score discrepancies, and reliabilities and validities were increased slightly as a result of the calibrations.
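The report does not reproduce its calibration formula here, but the idea of adjusting scores for reader tendencies can be illustrated with a minimal sketch: estimate each reader's leniency or severity as the difference between that reader's mean score and the grand mean, then subtract it. All names and data below are hypothetical, and a real calibration could use a more sophisticated model.

```python
import statistics

def calibrate(scores_by_reader):
    """Adjust each reader's scores for that reader's overall tendency
    (leniency or severity) relative to the grand mean of all scores.
    scores_by_reader: dict mapping reader id -> list of raw scores."""
    all_scores = [s for scores in scores_by_reader.values() for s in scores]
    grand_mean = statistics.mean(all_scores)
    calibrated = {}
    for reader, scores in scores_by_reader.items():
        # Positive offset = lenient reader; negative = severe reader.
        offset = statistics.mean(scores) - grand_mean
        calibrated[reader] = [s - offset for s in scores]
    return calibrated

# Hypothetical remote readers: "A" tends lenient, "B" tends severe.
raw = {"A": [4, 5, 4, 5], "B": [2, 3, 2, 3]}
adjusted = calibrate(raw)
```

After calibration both readers' score distributions are centered on the same grand mean, which is the sense in which such an adjustment reduces between-reader score discrepancies.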
The comparisons conducted suggest some promise for remote scoring, especially if more sophisticated calibration procedures can be developed and if improved reader monitoring can be implemented, but with current procedures remote scoring is not as effective as conference scoring. Under certain circumstances, however, where cost or convenience may be relatively more important than reliability and validity, remote scoring may be preferable to conference scoring.