
How many raters are needed for a reliable diagnosis?
Author(s) -
Noda Art M.,
Kraemer Helena Chmura,
Yesavage Jerome A.,
Periyakoil Vyjeyanthi S.
Publication year - 2001
Publication title -
International Journal of Methods in Psychiatric Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.275
H-Index - 73
eISSN - 1557-0657
pISSN - 1049-8931
DOI - 10.1002/mpr.107
Subject(s) - kappa , cohen's kappa , inter rater reliability , jackknife resampling , confidence interval , estimator , medical diagnosis , rating scale , statistics , mathematics , psychology , medicine
If each patient in a sample is independently diagnosed as either positive or negative by enough raters, we can evaluate the reliability (kappa coefficient and corresponding confidence interval) of each possible consensus of 2, 3, 4 … M raters, select the optimal consensus, and demonstrate the increase in reliability obtained with multiple diagnoses. Results indicate that the majority rule (for example, two out of three or three out of five raters) does not always yield the highest reliability, nor does any other single rule, so the optimal consensus must be determined empirically. The kappa coefficients and confidence intervals are calculated using a jackknife technique, and the optimal consensus is determined. Copyright © 2001 Whurr Publishers Ltd.
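
As an illustration of the procedure the abstract describes, the following is a minimal sketch (not the authors' code) of how one might compare k-of-m consensus rules empirically. It assumes each patient is rated by two independent panels of m raters, applies each k-of-m rule to both panels, computes Cohen's kappa between the two panel consensuses, and attaches a leave-one-patient-out jackknife confidence interval. The data are simulated, and all function names and parameters are hypothetical.

```python
import numpy as np

def cohen_kappa(x, y):
    """Cohen's kappa between two binary diagnosis vectors."""
    x, y = np.asarray(x), np.asarray(y)
    po = np.mean(x == y)                      # observed agreement
    px, py = x.mean(), y.mean()               # marginal positive rates
    pe = px * py + (1 - px) * (1 - py)        # chance-expected agreement
    if pe == 1.0:
        return np.nan                         # degenerate: no variation in either vector
    return (po - pe) / (1 - pe)

def consensus(ratings, k):
    """k-of-m rule: consensus positive iff at least k of the m raters say positive."""
    return (ratings.sum(axis=1) >= k).astype(int)

def consensus_kappa_jackknife(panel_a, panel_b, k):
    """
    Kappa of the k-of-m consensus, with a leave-one-patient-out jackknife
    standard error and an approximate 95% confidence interval.
    panel_a, panel_b: (n_patients, m) arrays of 0/1 ratings from two
    independent rater panels.
    """
    da, db = consensus(panel_a, k), consensus(panel_b, k)
    n = len(da)
    kappa_full = cohen_kappa(da, db)
    # Leave-one-patient-out estimates and jackknife pseudo-values
    loo = np.array([cohen_kappa(np.delete(da, i), np.delete(db, i))
                    for i in range(n)])
    pseudo = n * kappa_full - (n - 1) * loo
    k_jack = pseudo.mean()
    se = pseudo.std(ddof=1) / np.sqrt(n)
    return k_jack, (k_jack - 1.96 * se, k_jack + 1.96 * se)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 200, 3                              # hypothetical: 200 patients, two panels of 3 raters
    truth = rng.random(n) < 0.3                # latent true diagnosis
    p_pos = np.where(truth, 0.8, 0.15)         # assumed rater hit / false-positive rates
    panel_a = (rng.random((n, m)) < p_pos[:, None]).astype(int)
    panel_b = (rng.random((n, m)) < p_pos[:, None]).astype(int)
    # Compare every k-of-3 consensus rule empirically, as the abstract suggests
    for k in range(1, m + 1):
        kj, (lo, hi) = consensus_kappa_jackknife(panel_a, panel_b, k)
        print(f"{k}-of-{m} consensus: kappa = {kj:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```

Running the demo prints a jackknifed kappa and 95% confidence interval for each of the 1-of-3, 2-of-3, and 3-of-3 rules, so one can see directly which consensus is most reliable for a given data set rather than assuming the majority rule is best.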