Reliability and validity of three quality rating instruments for systematic reviews of observational studies
Author(s) - Hootman Jennifer M., Driban Jeffrey B., Sitler Michael R., Harris Kyle P., Cattano Nicole M.
Publication year - 2011
Publication title - Research Synthesis Methods
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.376
H-Index - 35
eISSN - 1759-2887
pISSN - 1759-2879
DOI - 10.1002/jrsm.41
Subject(s) - observational study, intraclass correlation, reliability, validity, inter-rater reliability, rating scale, statistics, psychology, medicine, clinical psychology, psychometrics
Abstract - The objective was to assess the inter-rater reliability, validity, and inter-instrument agreement of three quality rating instruments for systematic reviews of observational studies. Inter-rater reliability, criterion validity, and inter-instrument agreement were assessed for three quality rating scales, the Downs and Black (D&B), Newcastle–Ottawa Scale (NOS), and Scottish Intercollegiate Guidelines Network (SIGN), using a sample of 23 observational studies of musculoskeletal health outcomes. Inter-rater reliability was moderate to good for the D&B (intraclass correlation coefficient [ICC] = 0.73; CI = 0.47–0.88) and the NOS (ICC = 0.52; CI = 0.14–0.76), and poor for the SIGN (κ = 0.09; CI = −0.22 to 0.40). The NOS was not statistically valid (p = 0.35), whereas the SIGN was statistically valid (p < 0.05) with medium to large effect sizes (f² = 0.29–0.47). Inter-instrument agreement estimates were κ = 0.34 (CI = 0.05–0.62) for D&B versus SIGN, κ = 0.26 (CI = 0.00–0.52) for SIGN versus NOS, and κ = 0.43 (CI = 0.09–0.78) for D&B versus NOS. Reliability and validity vary considerably across the quality rating scales used to assess observational studies in systematic reviews. Copyright © 2011 John Wiley & Sons, Ltd.
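As a worked illustration of the two agreement statistics reported above, the sketch below (not the authors' code; the data and variable names are invented) computes a two-way random-effects ICC, of the kind used for continuous quality scores such as D&B totals, and Cohen's kappa, of the kind used for categorical ratings such as the SIGN's grades.

```python
# Minimal sketch, assuming hypothetical ratings from two raters.
import numpy as np

def cohens_kappa(r1, r2):
    """Cohen's kappa for two raters assigning categorical labels."""
    r1, r2 = np.asarray(r1), np.asarray(r2)
    cats = np.union1d(r1, r2)
    p_o = np.mean(r1 == r2)  # observed proportion of agreement
    # expected agreement from the raters' marginal category frequencies
    p_e = sum(np.mean(r1 == c) * np.mean(r2 == c) for c in cats)
    return (p_o - p_e) / (1.0 - p_e)

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.
    `scores` is an (n_subjects, k_raters) array (Shrout & Fleiss)."""
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()
    row_means = x.mean(axis=1)  # per-study means
    col_means = x.mean(axis=0)  # per-rater means
    msr = k * np.sum((row_means - grand) ** 2) / (n - 1)  # between studies
    msc = n * np.sum((col_means - grand) ** 2) / (k - 1)  # between raters
    resid = x - row_means[:, None] - col_means[None, :] + grand
    mse = np.sum(resid ** 2) / ((n - 1) * (k - 1))        # residual
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical example: two raters score six studies on a numeric scale
# and assign SIGN-style categories ("++", "+", "-").
numeric = np.array([[18, 20], [22, 21], [15, 17], [25, 24], [19, 18], [21, 23]])
sign_a = ["+", "++", "-", "++", "+", "+"]
sign_b = ["+", "+", "-", "++", "-", "+"]
print(f"ICC(2,1) = {icc_2_1(numeric):.2f}")
print(f"kappa    = {cohens_kappa(sign_a, sign_b):.2f}")
```

Kappa rather than an ICC suits the SIGN because its output is an ordered category rather than a numeric total, which is consistent with the abstract reporting ICCs for the D&B and NOS but κ for the SIGN and for the between-instrument comparisons.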