
PSYCHOMETRIC AND COGNITIVE FUNCTIONING OF AN UNDERDETERMINED COMPUTER‐BASED RESPONSE TYPE FOR QUANTITATIVE REASONING
Author(s) -
Bennett, Randy Elliot,
Morley, Mary,
Quardt, Dennis,
Singley, Mark K.,
Rock, Donald A.,
Katz, Irvin R.,
Nhouyvanisvong, Adisack
Publication year - 1998
Publication title -
ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/j.2333-8504.1998.tb01778.x
Subject(s) - psychology, cognition, cognitive psychology, psychometrics, item response theory, internal consistency, reliability, test perception, quantitative reasoning, underdetermined problems
This study evaluated the psychometric and cognitive functioning of a new computer‐delivered response type for measuring quantitative reasoning skill. This open‐ended, automatically scorable response type, called "Generating Examples" (GE), presents underdetermined problems that can have many right answers. Two GE tests were randomly spiraled among a group of paid volunteers; the tests differed in the manipulation of specific item features hypothesized to affect difficulty. Within‐group correlational and between‐group experimental analyses were performed to examine internal consistency reliability, relations with external criteria, features contributing to item difficulty, adverse impact, and examinee perceptions. Results showed that GE scores were reasonably reliable but only moderately related to the GRE quantitative section, suggesting the two tests might tap somewhat different skills. In the difficulty analyses, two of the three manipulated item features had the predicted effect: asking examinees to supply more than one correct answer and asking them to identify whether an item was solvable. The impact analyses detected no significant gender differences beyond those associated with the GRE General Test. Finally, examinees were evenly divided on whether GE items provided a fairer indicator of their ability than multiple‐choice items, but, as in past studies, they overwhelmingly preferred to take more conventional questions.