Similarity of the cut score in test sets with different item amounts using the modified Angoff, modified Ebel, and Hofstee standard-setting methods for the Korean Medical Licensing Examination
Author(s) -
Janghee Park,
Mi Kyoung Yim,
Na Jin Kim,
Duck Sun Ahn,
YoungMin Kim
Publication year - 2020
Publication title -
Journal of Educational Evaluation for Health Professions
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.397
H-Index - 9
ISSN - 1975-5937
DOI - 10.3352/jeehp.2020.17.28
Subject(s) - statistics , similarity , test , reliability , significant difference , item analysis , psychology , psychometrics
Purpose - The Korea Medical Licensing Examination (KMLE) typically contains a large number of items. The purpose of this study was to investigate whether the cut score differs between a standard-setting exercise that evaluates all items of the examination and one that evaluates only a subset of items.

Methods - We divided the item sets from the 3 most recent KMLEs into 4 subsets per year, each containing 25% of the items, stratified by item content category, discrimination index, and difficulty index. The entire panel of 15 members assessed all 360 items (100%) of the 2017 examination. Using the same stratification method, each set in split-half set 1 contained 184 items (51%) from the 2018 examination, and each set in split-half set 2 contained 182 items (51%) from the 2019 examination. We used the modified Angoff, modified Ebel, and Hofstee methods in the standard-setting process.

Results - With the same standard-setting method, the cut scores derived from stratified item subsets containing 25%, 51%, or 100% of the entire set differed by less than 1%. Rater reliability was higher when fewer items were rated.

Conclusion - When the entire item set was divided into equivalent subsets, assessing the examination using a portion of the item set (90 out of 360 items) yielded cut scores similar to those derived from the entire item set, with a higher correlation between panelists' individual assessments and the overall assessments.
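To illustrate the kind of computation behind the modified Angoff method named above, the sketch below averages panelists' item-level probability judgments into a cut score. The ratings are invented for illustration and do not come from the study; this is a minimal sketch of the general Angoff procedure, not the authors' exact protocol.

```python
from statistics import mean

# Hypothetical modified Angoff ratings: each inner list holds one panelist's
# judged probability (0-1) that a minimally competent examinee answers each
# item correctly. Values are illustrative only, not data from the study.
panel_ratings = [
    [0.60, 0.75, 0.55, 0.80],  # panelist 1
    [0.65, 0.70, 0.50, 0.85],  # panelist 2
    [0.55, 0.80, 0.60, 0.75],  # panelist 3
]

# Per-item expected score: average the panelists' probabilities for each item.
item_means = [mean(col) for col in zip(*panel_ratings)]

# Angoff cut score, expressed as a percentage of the maximum score:
# the mean of the per-item expected scores.
cut_score = mean(item_means) * 100
print(f"Angoff cut score: {cut_score:.1f}%")  # prints "Angoff cut score: 67.5%"
```

Comparing cut scores computed this way on a 25% stratified subset versus the full item set is, in essence, the comparison the study reports.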