Open Access
GE FRST Evaluation Report: How Well Does a Statistically-Based Natural Language Processing System Score Natural Language Constructed-Responses?
Author(s) - Burstein, Jill C.; Kaplan, Randy M.
Publication year - 1995
Publication title - ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/j.2333-8504.1995.tb01664.x
Subject(s) - natural language, natural language processing, computer science, artificial intelligence, psychology, mathematics education
There is considerable interest at Educational Testing Service (ETS) in including performance-based, natural language constructed-response items on standardized tests. Such items can be developed, but the projected time and cost of having them scored by human graders would be prohibitive. For ETS to include these item types on standardized tests, automated scoring systems need to be developed and evaluated; such systems could substantially reduce the time and cost of scoring. This report details the evaluation of a statistically-based scoring system, the General Electric Free-Response Scoring Tool (GE FRST), which was designed to score short-answer constructed responses of up to 17 words. The report describes how the system performs on responses to three different item types. Evaluating a system across multiple item types is important for determining whether its scoring method generalizes beyond a single item type.
