ANALYZING THE OPTION EFFECTS OF DIFFICULT TOEFL ITEMS WITH LOW BISERIALS: METHODS DEVELOPED FOR USE BY TEST ASSEMBLERS
Author(s) - Marilyn M. Hicks
Publication year - 1988
Publication title - ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/j.2330-8516.1988.tb00271.x
Subject(s) - test (biology) , computer science , test of english as a foreign language , construct (python library) , econometrics , statistics , item response theory , measure (data warehouse) , psychology , data mining , mathematics , psychometrics , language assessment , paleontology , biology , programming language , pedagogy
Several exploratory analyses of the fifths data generated by TOEFL item analyses were developed to evaluate the effects of options on the discriminability of difficult items, and to identify difficult items with low, unreliable biserials that have been rejected by Test Development but for which acceptable a‐parameters are probably estimable. Intended for use by test assemblers subsequent to an item analysis, the methods were mainly graphical, but also included the evaluation of a distance measure and other simple statistics. An effective distracter has the property that examinees are attracted to it in inverse order of ability. To the extent that this ordering is violated for certain ability levels, localized option effects occur that can impair item discrimination as well as the fit of the IRT model. The negative impact of these effects on model fit was illustrated, and methods for analyzing them were suggested. If item writers could account for the factors underlying the interaction between ability level and option responses, it might be possible to modify options accordingly, thereby improving the measurement effectiveness of the item. Departing from the usual reliance on a single index, the approaches in these analyses included, among other things, an evaluation of the biplot generated from a correspondence analysis of the matrix of fifths information, and an analysis of the total option response configuration. Many examples of these analyses were provided. A significant limitation of the r‐biserial for very difficult items, which restricts the ability of test assemblers to construct tests with effective measurement properties at high score levels, was illustrated. The index developed in this study to identify such items is regarded as an interim strategy until a conventional measure of item discrimination that is optimal over the entire scale of difficulty is developed, which is a current critical need.
The implications of introducing other dimensions into the test through items with nonmonotonic response patterns due to option effects were briefly discussed. It is possible that application of the procedures developed in the study might provide a method of exercising control over the dimensionality of the measuring instrument at the practical level of item construction.
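
The ordering property described in the abstract (a distracter should attract examinees in inverse order of ability, while the key should attract them in direct order) can be illustrated with a small check on fifths data. The sketch below is only an illustration of that idea, not the index or distance measure developed in the report. The function names, the data layout (one list of counts per option, ordered from the lowest to the highest ability fifth), and the counts themselves are hypothetical assumptions.

```python
from typing import Dict, List


def ordering_violations(props: List[float], increasing: bool, tol: float = 0.0) -> List[int]:
    """Return each index i where the step from fifth i to fifth i+1
    runs against the expected direction by more than `tol`."""
    bad = []
    for i in range(len(props) - 1):
        step = props[i + 1] - props[i]
        if (increasing and step < -tol) or (not increasing and step > tol):
            bad.append(i)
    return bad


def analyze_item(option_counts: Dict[str, List[int]], key: str) -> Dict[str, List[int]]:
    """Flag localized departures from the expected ordering for every option.

    option_counts maps an option label to the number of examinees choosing it
    in each ability fifth (lowest fifth first). This layout is an assumption.
    """
    n_fifths = len(option_counts[key])
    # Size of each fifth = total responses across all options in that fifth.
    group_sizes = [sum(counts[g] for counts in option_counts.values())
                   for g in range(n_fifths)]
    report = {}
    for label, counts in option_counts.items():
        props = [c / n for c, n in zip(counts, group_sizes)]
        # The key should rise with ability; distracters should fall.
        report[label] = ordering_violations(props, increasing=(label == key))
    return report


# Hypothetical counts for one four-option item, keyed "C".
item = {
    "A": [30, 25, 20, 12, 5],
    "B": [40, 35, 25, 30, 10],  # rebounds between the third and fourth fifths
    "C": [20, 25, 35, 38, 75],  # key
    "D": [10, 15, 20, 20, 10],  # rises over the lower fifths
}

print(analyze_item(item, key="C"))
# → {'A': [], 'B': [2], 'C': [], 'D': [0, 1]}
```

In this made-up example, options B and D are flagged at the ability levels where their attraction increases rather than decreases, which is the kind of localized option effect the abstract refers to. A nonzero `tol` could be used to ignore small sampling fluctuations.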
