Open Access
EVALUATION OF THE E‐RATER ® SCORING ENGINE FOR THE GRE ® ISSUE AND ARGUMENT PROMPTS
Author(s) - Chaitanya Ramineni, Catherine S. Trapani, David M. Williamson, Tim Davey, Brent Bridgeman
Publication year - 2012
Publication title - ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/j.2333-8504.2012.tb02284.x
Subject(s) - argument (complex analysis) , task (project management) , inter rater reliability , writing assessment , statistics , psychology , scoring system , computer science , cognitive psychology , artificial intelligence , mathematics , mathematics education , medicine , rating scale , management , surgery , economics
Automated scoring models for the e‐rater ® scoring engine were built and evaluated for the GRE ® argument and issue writing tasks. Prompt‐specific, generic, and generic with prompt‐specific intercept scoring models were built, and their performance against human scores was evaluated using statistics such as weighted kappas, Pearson correlations, standardized differences in mean scores, and correlations with external measures. Performance was also evaluated across demographic subgroups. Additional analyses were performed to establish appropriate agreement thresholds between human and e‐rater scores for unusual essays, and to assess the impact of using e‐rater on operational scores. The generic e‐rater scoring model with an operational prompt‐specific intercept was recommended for operational use for the issue writing task, and the prompt‐specific e‐rater scoring model for the argument writing task. The two automated scoring models were implemented to produce check scores at a discrepancy threshold of 0.5 with human scores.
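
For illustration only (not code from the report), the sketch below shows, under stated assumptions, how the kinds of human–machine agreement statistics named in the abstract (quadratic-weighted kappa, Pearson correlation, and a standardized difference in mean scores) and a simple check-score discrepancy rule at a 0.5 threshold might be computed. The function names are hypothetical, and the use of scipy and scikit-learn is an assumption; the report does not specify an implementation.

import numpy as np
from scipy.stats import pearsonr
from sklearn.metrics import cohen_kappa_score


def agreement_stats(human, erater):
    """Illustrative human vs. e-rater agreement statistics (assumed formulas)."""
    human = np.asarray(human, dtype=float)
    erater = np.asarray(erater, dtype=float)

    # Quadratic-weighted kappa on rounded integer score points.
    qwk = cohen_kappa_score(
        human.round().astype(int), erater.round().astype(int), weights="quadratic"
    )

    # Pearson correlation between human and machine scores.
    r, _ = pearsonr(human, erater)

    # Standardized difference in mean scores (machine minus human, pooled SD).
    pooled_sd = np.sqrt((human.var(ddof=1) + erater.var(ddof=1)) / 2.0)
    std_diff = (erater.mean() - human.mean()) / pooled_sd

    return {"weighted_kappa": qwk, "pearson_r": r, "standardized_difference": std_diff}


def needs_adjudication(human_score, erater_score, threshold=0.5):
    """Check-score rule: flag an essay when |human - e-rater| exceeds the threshold."""
    return abs(human_score - erater_score) > threshold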
