
Evaluating the performance of a deep learning‐based computer‐aided diagnosis (DL‐CAD) system for detecting and characterizing lung nodules: Comparison with the performance of double reading by radiologists
Author(s) - Li Li, Liu Zhou, Huang Hua, Lin Meng, Luo Dehong
Publication year - 2019
Publication title - Thoracic Cancer
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.823
H-Index - 28
eISSN - 1759-7714
pISSN - 1759-7706
DOI - 10.1111/1759-7714.12931
Subject(s) - medicine , nodule (geology) , cad , lung , lung cancer , nuclear medicine , radiology , reading (process) , pathology , paleontology , engineering drawing , engineering , biology , political science , law
Background: This study evaluated the performance of a state‐of‐the‐art commercial deep learning‐based computer‐aided diagnosis (DL‐CAD) system for detecting and characterizing pulmonary nodules.
Methods: Pulmonary nodules in 346 healthy subjects (male:female = 221:125, mean age 51 years) from a lung cancer screening program conducted from March to November 2017 were assessed independently by a DL‐CAD system and by double reading, and the performance of each in nodule detection and characterization was evaluated. An expert panel combined the results of the DL‐CAD system and double reading to form the reference standard.
Results: The DL‐CAD system showed a higher detection rate than double reading regardless of nodule size: overall (86.2% vs. 79.2%; P < 0.001), nodules ≥ 5 mm (96.5% vs. 88.0%; P = 0.008), and nodules < 5 mm (84.3% vs. 77.5%; P < 0.001). However, the false positive rate per computed tomography scan of the DL‐CAD system (1.53, 529/346) was considerably higher than that of double reading (0.13, 44/346; P < 0.001). Regarding nodule characterization, the sensitivity and specificity of the DL‐CAD system for distinguishing solid nodules > 5 mm (90.3% and 100.0%, respectively) and ground‐glass nodules (100.0% and 96.1%, respectively) were close to those of double reading, but dropped to 55.5% and 93%, respectively, when discriminating part‐solid nodules.
Conclusion: Our DL‐CAD system detected significantly more nodules than double reading. In the future, false positive findings should be further reduced and characterization accuracy improved.
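The per-scan false positive rates quoted in the Results can be reproduced from the counts given in the abstract (529 and 44 false positives across 346 scans). The following is a minimal sketch of that arithmetic only; the function name `false_positives_per_scan` and the variable names are illustrative assumptions and are not taken from the study.

```python
def false_positives_per_scan(false_positives: int, scans: int) -> float:
    """Return the mean number of false positive findings per CT scan.

    Hypothetical helper for illustration; not part of the study's method.
    """
    if scans <= 0:
        raise ValueError("scans must be a positive integer")
    return false_positives / scans


# Counts as reported in the abstract.
SCANS = 346
dl_cad_rate = false_positives_per_scan(529, SCANS)
double_reading_rate = false_positives_per_scan(44, SCANS)

print(f"DL-CAD:         {dl_cad_rate:.2f} false positives per scan")
print(f"Double reading: {double_reading_rate:.2f} false positives per scan")
```

Rounded to two decimals, these values match the reported 1.53 and 0.13. Because both rates share the same denominator, their ratio is simply 529/44, so the DL‐CAD system produced roughly twelve times as many false positives as double reading in this cohort.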