Moderating Possibly Irrelevant Multiple Mean Score Differences on a Test of Mathematical Reasoning
Author(s) - Martha L. Stocking, Thomas Jirele, Charles Lewis, Len Swanson
Publication year - 1998
Publication title - Journal of Educational Measurement
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.917
H-Index - 47
eISSN - 1745-3984
pISSN - 0022-0655
DOI - 10.1111/j.1745-3984.1998.tb00534.x
Subject(s) - test (biology), test score, moderation, psychology, population, statistics, correlation, mathematics, social psychology, demography, mathematics education, standardized test, paleontology, biology, geometry, sociology
A pool of items from operational tests of mathematical reasoning was constructed to investigate the feasibility of using automated test assembly (ATA) methods to simultaneously moderate possibly irrelevant differences between the performance of women and men, and African American and White test takers. None of the artificial tests exhibited substantial impact moderation, although the estimated mean scaled score differences for the relevant population indicated a modest move in the intended direction: the difference between scaled score means was reduced by about 20% for women and men and about 9% for African American and White test takers. Although many issues in the implementation of this methodology remain to be solved, the consideration of impact in ATA, along with the maintenance of the detailed test plan, appears to be a potential method of moderating possibly irrelevant mean test score differences.