AGREEMENT BETWEEN EXPERT SYSTEM AND HUMAN RATINGS OF CONSTRUCTED‐RESPONSES TO COMPUTER SCIENCE PROBLEMS
Author(s) - Bennett Randy Elliot; Gong Brian; Kershaw Roger C.; Rock Donald A.; Soloway Elliot; Macalalad Alex
Publication year - 1988
Publication title - ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/j.2330-8516.1988.tb00276.x
Subject(s) - grading (engineering), microcomputer, interchangeability, computer science, mathematics education, psychology, agreement, cognitive psychology, statistics, artificial intelligence, mathematics, programming language, linguistics, engineering, telecommunications, chip, civil engineering, philosophy
If computers can be programmed to score complex constructed-response items, substantial savings in selected ETS programs might be realized and the development of mastery assessment systems that incorporate "real-world" tasks might be facilitated. This study investigated the extent of agreement between MicroPROUST, a prototype microcomputer-based expert scoring system, and human readers for two Advanced Placement Computer Science free-response items. To assess agreement, a balanced incomplete block design was used, with two groups of four readers grading 43 student solutions to the first problem and 45 solutions to the second. Readers assigned numeric grades and diagnostic comments in separate readings. Results showed MicroPROUST to be unable to grade a sizable portion of solutions but to perform impressively on those it could analyze. For one problem, MicroPROUST assigned grades and diagnostic comments similar to those assigned by readers. For the other, MicroPROUST's agreement with readers on grades was lower than the readers' agreement among themselves, its grades were higher, and it gave fewer comments, particularly on structure and style. The extent of disagreement on grades, however, was small, and much of it disappeared when papers were rescored discounting style. MicroPROUST's interchangeability with human readers on one problem suggests that there are conditions under which automated scoring of complex constructed responses might be implemented by ETS.
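The report's agreement analyses are not reproduced on this page. As a rough illustration only, the Python sketch below shows how machine-human grade agreement of the kind described in the abstract might be quantified; the grade vectors are hypothetical, not data from the study.

# Rough illustration only: the grade vectors below are hypothetical.
from statistics import correlation, mean  # statistics.correlation needs Python 3.10+

# Hypothetical 0-9 grades assigned to the same ten student solutions
machine_grades = [7, 5, 9, 6, 8, 4, 7, 9, 5, 6]
reader_grades = [6, 5, 8, 6, 7, 4, 7, 9, 4, 6]

# Pearson correlation: do the machine and the reader rank solutions similarly?
r = correlation(machine_grades, reader_grades)

# Exact-agreement rate: how often do the two grades match outright?
exact = mean(m == h for m, h in zip(machine_grades, reader_grades))

# Mean difference: does the machine grade systematically higher, as the
# abstract reports for one of the two problems?
bias = mean(m - h for m, h in zip(machine_grades, reader_grades))

print(f"r = {r:.2f}, exact agreement = {exact:.0%}, mean difference = {bias:+.2f}")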
