Computer‐Aided Detection AI Reduces Interreader Variability in Grading Hip Abnormalities With MRI
Author(s) -
Tibrewala Radhika,
Ozhinsky Eugene,
Shah Rutwik,
Flament Io,
Crossley Kay,
Srinivasan Ramya,
Souza Richard,
Link Thomas M.,
Pedoia Valentina,
Majumdar Sharmila
Publication year - 2020
Publication title -
Journal of Magnetic Resonance Imaging
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.563
H-Index - 160
eISSN - 1522-2586
pISSN - 1053-1807
DOI - 10.1002/jmri.27164
Subject(s) - medicine , receiver operating characteristic , grading (engineering) , confidence interval , osteoarthritis , magnetic resonance imaging , radiology , area under the curve , cartilage , nuclear medicine , pathology , anatomy , civil engineering , alternative medicine , engineering
Background Accurate interpretation of hip MRI is time‐intensive and difficult, prone to inter‐ and intrareviewer variability, and lacks a universally accepted grading scale to evaluate morphological abnormalities.
Purpose To 1) develop and evaluate a deep‐learning‐based model for binary classification of hip osteoarthritis (OA) morphological abnormalities on MR images, and 2) develop an artificial intelligence (AI)‐based assist tool to determine whether using the model predictions improves interreader agreement in hip grading.
Study Type Retrospective study aimed to evaluate a technical development.
Population A total of 764 MRI volumes (364 patients) obtained from two studies (242 patients from LASEM [FORCe] and 122 patients from UCSF), split into 65%/25%/10% training, validation, and test sets for network training.
Field Strength/Sequence 3T MRI, 2D T2 FSE, PD SPAIR.
Assessment Automatic binary classification of cartilage lesions, bone marrow edema‐like lesions, and subchondral cyst‐like lesions using the MRNet; interreader agreement before and after using network predictions.
Statistical Tests Receiver operating characteristic (ROC) curve, area under curve (AUC), specificity and sensitivity, and balanced accuracy.
Results For cartilage lesions, bone marrow edema‐like lesions, and subchondral cyst‐like lesions, the AUCs were 0.80 (95% confidence interval [CI] 0.65, 0.95), 0.84 (95% CI 0.67, 1.00), and 0.77 (95% CI 0.66, 0.85), respectively. The sensitivity and specificity of the radiologist for binary classification were: 0.79 (95% CI 0.65, 0.93) and 0.80 (95% CI 0.59, 1.02); 0.40 (95% CI –0.02, 0.83) and 0.72 (95% CI 0.59, 0.86); 0.75 (95% CI 0.45, 1.05) and 0.88 (95% CI 0.77, 0.98). The interreader balanced accuracy increased from 53%, 71%, and 56% to 60%, 73%, and 68% after using the network predictions and saliency maps.
Data Conclusion We have shown that a deep‐learning approach achieved high performance in clinical classification tasks on hip MR images, and that using the predictions from the deep‐learning model improved the interreader agreement in all pathologies.
Level of Evidence 3
Technical Efficacy Stage 1
J. Magn. Reson. Imaging 2020;52:1163–1172.
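For readers unfamiliar with the statistics named in the abstract, the following is a minimal sketch of how sensitivity, specificity, balanced accuracy, and ROC AUC can be computed for a binary lesion classifier. It is not the authors' code: the function names and the toy labels in the usage note are hypothetical, and the AUC is computed with the standard pairwise-ranking (Mann-Whitney) formulation rather than any method confirmed by the paper.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, tn, fp, fn) for binary labels coded as 0 (absent) and 1 (present)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).

    Assumes both classes occur in y_true; otherwise a denominator is zero.
    """
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn), tn / (tn + fp)


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity, which is robust to class imbalance."""
    sens, spec = sensitivity_specificity(y_true, y_pred)
    return (sens + spec) / 2


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties count as half a win. Assumes at least one positive and one negative.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

As a usage illustration with made-up data, calling `balanced_accuracy([1, 1, 1, 0, 0, 0, 0, 1], [1, 1, 0, 0, 0, 1, 0, 1])` averages a sensitivity of 3/4 and a specificity of 3/4. Balanced accuracy is a natural choice when lesions are rare, because plain accuracy would reward a reader who marks every scan as normal.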