Open Access
Screening for technical flaws in multiple-choice items. A generalizability study.
Author(s) - Lotte Dyhrberg O’Neill, Sara Mathilde Radl Mortensen, Cita Nørgaard, Anne Lindebo Holm, Ulla Glenert Friis
Publication year - 2019
Publication title - Dansk Universitetspædagogisk Tidsskrift
Language(s) - English
Resource type - Journals
eISSN - 2245-1374
pISSN - 1901-5089
DOI - 10.7146/dut.v14i26.106496
Subject(s) - generalizability theory, reliability (semiconductor), test (biology), psychology, applied psychology, quality (philosophy), internal validity, computer science, statistics, mathematics, developmental psychology, paleontology, power (physics), philosophy, physics, epistemology, quantum mechanics, biology
Construction errors in multiple-choice items are quite prevalent and threaten the validity of multiple-choice tests. Currently, very little research seems to exist on the usefulness of systematic item screening by local review committees before test administration. The aim of this study was therefore to examine validity and feasibility aspects of review committee screening for item flaws. We examined the reliability of item reviewers’ independent judgments of the presence or absence of item flaws with a generalizability study design and found only moderate reliability using five reviewers. Statistical analyses of actual exam scores could be a more efficient way of identifying flaws and improving the average item discrimination of tests in local contexts. The question of the validity of human judgments of item flaws is important, not just for sufficiently sound quality assurance procedures for tests in local contexts, but also for the global research on item flaws.
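As a rough illustration of the kind of reliability estimate a generalizability study yields, the sketch below computes a relative G coefficient for reviewers' flaw judgments. It rests on assumptions not stated in the abstract: a fully crossed items × reviewers design, binary judgments (1 = flaw present, 0 = absent), and variance components estimated from two-way ANOVA mean squares. The function names and the toy ratings are hypothetical and are not the study's data or necessarily the authors' procedure.

```python
def variance_components(ratings):
    """Estimate variance components for a crossed items x reviewers design.

    ratings: list of rows, one row per item, one column per reviewer,
    each entry 1 (flaw judged present) or 0 (absent). Hypothetical layout.
    Returns (var_item, var_reviewer, var_residual).
    """
    n_i = len(ratings)
    n_r = len(ratings[0])
    grand = sum(sum(row) for row in ratings) / (n_i * n_r)
    item_means = [sum(row) / n_r for row in ratings]
    reviewer_means = [sum(row[j] for row in ratings) / n_i for j in range(n_r)]

    # Sums of squares for a two-way layout with one observation per cell.
    ss_item = n_r * sum((m - grand) ** 2 for m in item_means)
    ss_reviewer = n_i * sum((m - grand) ** 2 for m in reviewer_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_residual = ss_total - ss_item - ss_reviewer

    ms_item = ss_item / (n_i - 1)
    ms_reviewer = ss_reviewer / (n_r - 1)
    ms_residual = ss_residual / ((n_i - 1) * (n_r - 1))

    # Negative estimates are truncated at zero, a common convention.
    var_item = max((ms_item - ms_residual) / n_r, 0.0)
    var_reviewer = max((ms_reviewer - ms_residual) / n_i, 0.0)
    return var_item, var_reviewer, ms_residual


def g_coefficient(var_item, var_residual, n_reviewers):
    """Relative G coefficient when averaging over n_reviewers reviewers.

    Undefined (division by zero) if both variance components are zero.
    """
    return var_item / (var_item + var_residual / n_reviewers)


# Toy, made-up judgments: 6 items rated by 3 reviewers.
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [0, 0, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
]
var_item, var_reviewer, var_residual = variance_components(toy)
g_one = g_coefficient(var_item, var_residual, 1)   # a single reviewer
g_five = g_coefficient(var_item, var_residual, 5)  # projected for five reviewers
```

Under these assumptions, projecting the coefficient for different numbers of reviewers shows how much a committee would need to grow to reach a target reliability, which is one way to weigh the feasibility of committee screening.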
