Inter‐Rater Reliability of Quantifying Pleural B‐Lines Using Multiple Counting Methods
Author(s) -
Anderson Kenton L.,
Fields J. Matthew,
Panebianco Nova L.,
Jenq Katherine Y.,
Marin Jennifer,
Dean Anthony J.
Publication year - 2013
Publication title -
Journal of Ultrasound in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.574
H-Index - 91
eISSN - 1550-9613
pISSN - 0278-4297
DOI - 10.7863/jum.2013.32.1.115
Subject(s) - medicine, intraclass correlation, confidence interval, interclass correlation, reliability (semiconductor), nuclear medicine, thorax (insect anatomy), inter rater reliability, statistics, anatomy, mathematics, clinical psychology, power (physics), rating scale, physics, quantum mechanics, psychometrics
Objectives - Sonographic B‐lines are a sign of increased extravascular lung water. Several techniques for quantifying B‐lines within individual rib spaces have been described, as well as different methods for “scoring” the cumulative B‐line counts over the entire thorax. The interobserver reliability of these methods is unknown. This study examined 3 methods of quantifying B‐lines for inter‐rater reliability.
Methods - Videotaped pleural assessments of adult patients presenting to the emergency department with dyspnea and suspected acute heart failure were reviewed by 3 blinded pairs of emergency physicians. Each pair performed B‐line counts within single rib spaces using 1 of the following 3 predetermined methods: 1, individual B‐lines are counted over an entire respiratory cycle; 2, as per method 1, but confluent B‐lines are counted as multiple B‐lines based on the percentage of the rib space they occupy; and 3, as per method 2, but the count is made at the moment when the most B‐lines are seen, not over an entire respiratory cycle. A single‐measures intraclass correlation coefficient was used to assess inter‐rater reliability for the 3 definitions of B‐line counts.
Results - A total of 456 video clips were reviewed. The intraclass correlation coefficients (95% confidence intervals) for methods 1, 2, and 3 were 0.84 (0.81–0.87), 0.87 (0.85–0.90), and 0.89 (0.87–0.91), respectively. The difference between methods 1 and 3 was significant (P = .003).
Conclusions - All methods of B‐line quantification showed substantial inter‐rater agreement. Method 3 was more reliable than method 1. There were no other significant differences between the methods. We recommend the use of method 3 because it is technically simpler to perform and more reliable than method 1.
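
For readers unfamiliar with the reliability statistic reported above, the following is a minimal sketch of how a single‐measures intraclass correlation coefficient (ICC) can be computed for a pair of raters. The abstract does not state which ICC form the authors used, so the choice here (two‐way random effects, absolute agreement, often written ICC(2,1) after Shrout and Fleiss) is an assumption. The function name `icc_2_1` and the B‐line counts in `counts` are hypothetical illustrations, not study data.

```python
def icc_2_1(ratings):
    """Single-measures ICC(2,1): two-way random effects, absolute agreement.

    ratings: list of rows, one row per rated clip; each row holds one
    B-line count per rater. The ICC form is an assumption; the abstract
    does not specify which variant was used.
    """
    n = len(ratings)      # number of rated clips (subjects)
    k = len(ratings[0])   # number of raters

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell).
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between clips
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between raters
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical B-line counts: one row per video clip, one column per rater.
counts = [
    [0, 0],
    [2, 3],
    [5, 4],
    [8, 8],
    [1, 1],
    [6, 7],
]
print(f"ICC(2,1) = {icc_2_1(counts):.2f}")
```

Under this form, systematic differences between raters (one rater consistently counting higher) lower the coefficient, which suits a question about whether two physicians arrive at the same count. The confidence intervals and the between‐method comparison reported in the Results are not reproduced by this sketch.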