
GRE ANALYTICAL REASONING ITEM STATISTICS PREDICTION STUDY
Author(s) -
Boldt R. F.
Publication year - 1998
Publication title -
ETS Research Report Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.235
H-Index - 5
ISSN - 2330-8516
DOI - 10.1002/j.2333-8504.1998.tb01786.x
Subject(s) - statistics, sample (material), linear regression, sample size determination, item response theory, computer science, econometrics, mathematics, psychometrics
Chalifour and Powers (1989) noted that the ability to predict item statistics might be used to reduce the volume of item pretesting (Mislevy, Sheehan, & Wingersky, 1993). It could also lead to better control of test statistics, such as item difficulty distributions, and to improved test specifications. Research such as the Chalifour and Powers study, which examined prediction of GRE analytical reasoning item statistics, is needed to examine the prospects for attaining these benefits. Those authors used linear regression but surmised that non-linear techniques might provide somewhat better prediction of item statistics. They had amassed item statistics on a very large sample of analytical reasoning items, and that sample was used in the present study. Predictions were generated using a type of neural net, a technique that can accommodate a wide variety of non-linear relationships, though at the cost of requiring the calculation of many constants in the prediction function. This technique did indeed provide more accurate predictions of item difficulty and item discrimination in an estimation sample. However, when the functions developed in the estimation sample were cross-validated in a fresh sample, the advantages noted in the estimation sample disappeared. Only by including expert judgements of item difficulty with the other predictors was the accuracy of prediction improved.
The variables that produced the above results had been selected by Chalifour and Powers (1989) for their efficacy in linear prediction. The study reported here also used a "genetic algorithm", which provides a quasi-random search of the predictor set to find an optimal set of predictors. The search used the validity of neural nets computed during operation of the algorithm to evaluate the efficacy of prediction. Hence, this technique sought variable sets that were optimal when a neural net was to be used.
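The kind of predictor-set search described above can be illustrated with a minimal sketch. It is not the procedure used in the study: the report evaluated each candidate set by the validity of a neural net, whereas this sketch substitutes a much simpler stand-in (the correlation between the criterion and an unweighted sum of the selected predictors). The function names, the selection and crossover scheme, the parameter defaults, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def composite_validity(mask, predictors, criterion):
    """Stand-in fitness (NOT a neural net): validity of a unit-weight composite."""
    chosen = [col for col, keep in zip(predictors, mask) if keep]
    if not chosen:
        return 0.0
    composite = [sum(values) for values in zip(*chosen)]
    return pearson(composite, criterion)


def genetic_search(n_predictors, fitness, pop_size=20, generations=30,
                   mutation_rate=0.05, seed=0):
    """Quasi-random search over predictor subsets encoded as 0/1 masks."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_predictors)]
                  for _ in range(pop_size)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Truncation selection: the fitter half of the population become parents.
        parents = sorted(population, key=fitness, reverse=True)[:pop_size // 2]
        children = []
        while len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_predictors)  # one-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: flip each bit with a small probability.
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]
            children.append(child)
        population = children
        candidate = max(population, key=fitness)
        if fitness(candidate) > fitness(best):
            best = candidate
    return best


if __name__ == "__main__":
    # Synthetic data (hypothetical): only the first two predictors matter.
    rng = random.Random(42)
    n_items, n_predictors = 200, 6
    predictors = [[rng.gauss(0, 1) for _ in range(n_items)]
                  for _ in range(n_predictors)]
    difficulty = [predictors[0][i] + predictors[1][i] + rng.gauss(0, 1)
                  for i in range(n_items)]
    half = n_items // 2

    def estimation_fitness(mask):
        return composite_validity(mask, [p[:half] for p in predictors],
                                  difficulty[:half])

    def validation_fitness(mask):
        return composite_validity(mask, [p[half:] for p in predictors],
                                  difficulty[half:])

    best = genetic_search(n_predictors, estimation_fitness)
    print("selected mask:", best)
    print("estimation validity:", round(estimation_fitness(best), 3))
    print("validation validity:", round(validation_fitness(best), 3))
```

Because the search optimizes validity in the estimation sample, the selected set should always be re-evaluated in a sample that played no part in the search, as the study did.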
The genetic algorithm achieved some improvement in the prediction of item difficulties in the estimation sample, but no improvement with regard to discrimination. When the variable sets and the related nets were evaluated in the validation sample, any advantage gained by the search was lost: the validities of predictions in the validation sample were approximately equal whether or not the genetic algorithm had been used in developing the prediction functions. Examination, in the validation sample, of the root-mean-square discrepancies between predicted and actual values for item difficulties and discriminations likewise revealed no advantage for the neural net. In sum, application of the complex and computer-intensive neural nets and genetic algorithms revealed no advantage over linear methods for predicting item difficulty and item discrimination statistics for GRE analytical reasoning items.
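The comparison of root-mean-square discrepancies across samples can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's analysis: a one-predictor least-squares line stands in for the linear method, a nearest-neighbor rule stands in for a highly flexible predictor (it is not a neural net), and the data are synthetic. All function names are hypothetical.

```python
import math
import random


def rmse(predicted, actual):
    """Root-mean-square discrepancy between predicted and actual values."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def fit_line(x, y):
    """Least-squares slope and intercept for a single predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def nearest_neighbor(x_train, y_train):
    """Flexible stand-in: predict the criterion of the closest estimation item."""
    def predict(x):
        idx = min(range(len(x_train)), key=lambda i: abs(x_train[i] - x))
        return y_train[idx]
    return predict


if __name__ == "__main__":
    # Synthetic items (hypothetical): difficulty is linear in one predictor plus noise.
    rng = random.Random(7)
    x = [rng.gauss(0, 1) for _ in range(200)]
    y = [0.5 * v + rng.gauss(0, 1) for v in x]
    x_est, y_est = x[:100], y[:100]   # estimation sample
    x_val, y_val = x[100:], y[100:]   # fresh validation sample

    slope, intercept = fit_line(x_est, y_est)
    flexible = nearest_neighbor(x_est, y_est)

    for label, xs, ys in (("estimation", x_est, y_est),
                          ("validation", x_val, y_val)):
        linear_pred = [slope * v + intercept for v in xs]
        flexible_pred = [flexible(v) for v in xs]
        print(label,
              "linear RMSE:", round(rmse(linear_pred, ys), 3),
              "flexible RMSE:", round(rmse(flexible_pred, ys), 3))
```

The flexible rule reproduces the estimation sample exactly by construction, so only the validation-sample figures say anything about how the two approaches compare.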