Standard setting in an objective structured clinical examination: use of global ratings of borderline performance to determine the passing score
Author(s) - Wilkinson Tim J, Newble David I, Frampton Christopher M
Publication year - 2001
Publication title - Medical Education
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.776
H-Index - 138
eISSN - 1365-2923
pISSN - 0308-0110
DOI - 10.1111/j.1365-2923.2001.01041.x
Subject(s) - objective structured clinical examination , competence (human resources) , construct validity , face validity , psychology , gold standard (test) , educational measurement , reliability (semiconductor) , medical education , clinical psychology , medicine , curriculum , psychometrics , social psychology , pedagogy , power (physics) , physics , quantum mechanics
Background
Objective structured clinical examination (OSCE) standard‐setting procedures are not well developed and are often time‐consuming and complex. We report an evaluation of a simple 'contrasting groups' method, applied to an OSCE conducted simultaneously in three separate schools.

Subjects
Medical students undertaking an end‐of‐fifth‐year multidisciplinary OSCE.

Methods
Using structured marking sheets, pairs of examiners independently scored student performance at each OSCE station. Examiners also provided a global rating of overall performance. The actual scores of any borderline candidates at each station were averaged to provide a passing score for that station. The passing scores for all stations were combined to become the passing score for the whole exam. Validity was determined by making comparisons with performance on other fifth‐year assessments. Reliability measures comprised interschool agreement, interexaminer agreement and interstation variability.

Results
The approach was simple and had face validity. There was a stronger association between the performance of borderline candidates on the OSCE and their in‐course assessments than with their performance on the written exam, giving a weak measure of construct validity in the absence of a better 'gold standard'. There was good agreement between examiners in identifying borderline candidates. There were significant differences between schools in the borderline score for some stations, which disappeared when more than three stations were aggregated.

Conclusion
This practical method provided a valid and reliable competence‐based pass mark. Combining marks from all stations before determining the pass mark was more reliable than making decisions based on individual stations.
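The arithmetic of the borderline‐group procedure described in the Methods can be illustrated with a minimal sketch. The data structures, station names and scores below are hypothetical assumptions for illustration only, not the authors' instrument or data; the sketch simply shows averaging the checklist scores of candidates rated 'borderline' at each station, then combining the station pass marks into a whole‐exam pass mark.

```python
# Minimal sketch of a borderline-group standard-setting calculation.
# All names, stations and scores are hypothetical.

from statistics import mean

# Hypothetical OSCE results: per station, each candidate has a checklist
# score and an examiner global rating ("pass", "borderline", "fail").
results = {
    "history_taking": [
        {"score": 72, "rating": "pass"},
        {"score": 55, "rating": "borderline"},
        {"score": 58, "rating": "borderline"},
        {"score": 40, "rating": "fail"},
    ],
    "cardiac_exam": [
        {"score": 80, "rating": "pass"},
        {"score": 61, "rating": "borderline"},
        {"score": 45, "rating": "fail"},
    ],
}

def station_pass_mark(candidates):
    """Average checklist score of candidates rated 'borderline' at one station."""
    borderline = [c["score"] for c in candidates if c["rating"] == "borderline"]
    return mean(borderline) if borderline else None

# Pass mark per station: mean checklist score of the borderline group.
station_marks = {name: station_pass_mark(cands) for name, cands in results.items()}

# Whole-exam pass mark: station pass marks combined (summed here) before any
# pass/fail decision, rather than deciding station by station.
exam_pass_mark = sum(m for m in station_marks.values() if m is not None)

print(station_marks)   # e.g. {'history_taking': 56.5, 'cardiac_exam': 61}
print(exam_pass_mark)  # e.g. 117.5
```

In practice the combined mark would be compared against each candidate's total OSCE score, consistent with the abstract's conclusion that aggregating across stations before setting the pass mark is more reliable than station‐level decisions.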
