Computing inter‐rater reliability and its variance in the presence of high agreement
Author(s) -
Gwet Kilem Li
Publication year - 2008
Publication title -
British Journal of Mathematical and Statistical Psychology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.157
H-Index - 51
eISSN - 2044-8317
pISSN - 0007-1102
DOI - 10.1348/000711006x126600
Subject(s) - estimator, statistics, variance (accounting), inter rater reliability, reliability (semiconductor), kappa, independence (probability theory), mathematics, confidence interval, cohen's kappa, monte carlo method, econometrics, agreement, physics, rating scale, economics, power (physics), accounting, geometry, quantum mechanics, linguistics, philosophy
Pi (π) and kappa (κ) statistics are widely used in psychiatry and psychological testing to compute the extent of agreement between raters on nominally scaled data. These coefficients occasionally yield unexpected results in situations known as the paradoxes of kappa. This paper explores the origin of these limitations and introduces an alternative, more stable agreement coefficient referred to as the AC1 coefficient. Also proposed are new variance estimators for the multiple‐rater generalized π and AC1 statistics, whose validity does not depend on the hypothesis of independence between raters. This is an improvement over existing alternative variances, which depend on the independence assumption. A Monte‐Carlo simulation study demonstrates the validity of these variance estimators for confidence interval construction, and confirms the value of AC1 as an improved alternative to existing inter‐rater reliability statistics.
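The abstract gives no formulas, so the following is only a minimal sketch of the high-agreement paradox it describes, restricted to two raters and using the commonly cited definitions of the chance-agreement terms for Cohen's κ, Scott's π, and Gwet's AC1. The function name `agreement_coefficients` and the example ratings are hypothetical, chosen for illustration; the multiple-rater generalization and the variance estimators proposed in the paper are not reproduced here.

```python
from collections import Counter


def agreement_coefficients(ratings_a, ratings_b):
    """Return (observed agreement, Cohen's kappa, Scott's pi, Gwet's AC1).

    Illustrative two-rater sketch; not the paper's multiple-rater estimators.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both raters must rate the same non-empty set of subjects")

    n = len(ratings_a)
    categories = sorted(set(ratings_a) | set(ratings_b))
    q = len(categories)
    if q < 2:
        raise ValueError("at least two categories must be observed")

    # Observed agreement: share of subjects given the same category by both raters.
    p_a = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)

    # Cohen's kappa: chance agreement from the product of each rater's marginals.
    pe_kappa = sum((count_a[k] / n) * (count_b[k] / n) for k in categories)

    # Average marginal proportion per category, used by both pi and AC1.
    avg = {k: (count_a[k] + count_b[k]) / (2 * n) for k in categories}

    # Scott's pi: chance agreement from squared average marginals.
    pe_pi = sum(v ** 2 for v in avg.values())

    # Gwet's AC1: chance agreement is sum of avg_k * (1 - avg_k), divided by (q - 1).
    pe_ac1 = sum(v * (1 - v) for v in avg.values()) / (q - 1)

    def chance_corrected(p_e):
        return (p_a - p_e) / (1 - p_e)

    return p_a, chance_corrected(pe_kappa), chance_corrected(pe_pi), chance_corrected(pe_ac1)


# Hypothetical data: 100 subjects, the raters agree on 90 of them, and almost
# every rating falls in the "yes" category (the high-agreement situation).
rater_a = ["yes"] * 90 + ["yes"] * 5 + ["no"] * 5
rater_b = ["yes"] * 90 + ["no"] * 5 + ["yes"] * 5

p_a, kappa, pi, ac1 = agreement_coefficients(rater_a, rater_b)
print(f"observed={p_a:.3f} kappa={kappa:.3f} pi={pi:.3f} AC1={ac1:.3f}")
```

With these made-up ratings the raters agree on 90% of subjects, yet κ and π come out slightly negative (about -0.05) because their chance-agreement term is inflated by the skewed marginals, while AC1 stays near 0.89. This is one instance of the behaviour the abstract calls the paradoxes of kappa.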