Reproducibility of quantitative high‐throughput BI‐RADS features extracted from ultrasound images of breast cancer
Author(s) - Hu Yuzhou, Qiao Mengyun, Guo Yi, Wang Yuanyuan, Yu Jinhua, Li Jiawei, Chang Cai
Publication year - 2017
Publication title - Medical Physics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.473
H-Index - 180
eISSN - 2473-4209
pISSN - 0094-2405
DOI - 10.1002/mp.12275
Subject(s) - reproducibility, concordance correlation coefficient, computer science, BI‐RADS, support vector machine, artificial intelligence, segmentation, breast cancer, breast imaging, pattern recognition (psychology), breast ultrasound, ultrasound, Pearson product moment correlation coefficient, concordance, mammography, cancer, mathematics, radiology, medicine, statistics
Purpose: Digital Breast Imaging Reporting and Data System (BI‐RADS) features extracted from ultrasound images are essential in computer‐aided diagnosis, prediction, and prognosis of breast cancer. This study focuses on the reproducibility of quantitative high‐throughput BI‐RADS features in the presence of variations due to different segmentation results, various ultrasound machine models, and multiple ultrasound machine settings.
Methods: Dataset 1 consists of 399 patients with invasive breast cancer and is used as the training set to measure the reproducibility of features, while dataset 2 consists of 138 other patients and is a validation set used to evaluate the diagnostic performance of the final reproducible features. Four hundred and sixty high‐throughput BI‐RADS features are designed and quantized according to the BI‐RADS lexicon. The Concordance Correlation Coefficient (CCC) and Deviation (Dev) are used to assess the effect of the segmentation methods, and the Between‐class Distance (BD) is used to study the influence of the machine models. In addition, the features retained by both analyses are further examined for their sensitivity to multiple machine settings. Subsequently, the absolute value of the Pearson Correlation Coefficient (R_abs) is applied for redundancy elimination. Finally, the features that are reproducible and not redundant are preserved as the stable feature set. A 10‐fold Support Vector Machine (SVM) classifier is employed to verify the diagnostic ability.
Results: One hundred and fifty‐three features were found to have high reproducibility (CCC > 0.9 and Dev < 0.1) between the manual and automatic segmentations. Three hundred and thirty‐nine features were stable (BD < 0.2) across different machine models. The two feature sets shared 102 features, of which nine were highly sensitive to the machine settings. Forty‐six features were finally preserved after redundancy elimination. For the validation in dataset 2, the area under the curve (AUC) of the 10‐fold SVM classifier was 0.915.
Conclusions: Three factors (segmentation results, machine models, and machine settings) may affect the reproducibility of high‐throughput BI‐RADS features to various degrees. Our 46 reproducible features were robust to these factors and were capable of distinguishing benign from malignant breast tumors.
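The abstract names its statistics without giving formulas, so the following is only an illustrative sketch of two of them, not the authors' implementation. It assumes Lin's standard definition of the concordance correlation coefficient for comparing one feature measured under two segmentations, and a simple greedy filter for the Pearson‐based redundancy step. The function names and the `r_max` cutoff are hypothetical; the abstract does not state the correlation threshold that was used, and Dev and BD are omitted because their definitions are not given here.

```python
import numpy as np


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient between two measurement vectors.

    Assumed here: x and y hold the same feature computed on the same lesions
    under two conditions (e.g. manual vs. automatic segmentation).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mx, my = x.mean(), y.mean()
    # Population (biased) covariance and variances, as in Lin's definition.
    cov = ((x - mx) * (y - my)).mean()
    return 2.0 * cov / (x.var() + y.var() + (mx - my) ** 2)


def drop_redundant(features, r_max=0.9):
    """Greedy redundancy filter on |Pearson r|.

    features: array of shape (n_samples, n_features) with non-constant columns.
    r_max: hypothetical cutoff; not taken from the abstract.
    Returns the indices of the columns that are kept.
    """
    r_abs = np.abs(np.corrcoef(features, rowvar=False))
    kept = []
    for j in range(features.shape[1]):
        # Keep column j only if it is not strongly correlated with any kept column.
        if all(r_abs[j, k] <= r_max for k in kept):
            kept.append(j)
    return kept


if __name__ == "__main__":
    manual = [1.0, 2.0, 3.0]
    automatic = [2.0, 3.0, 4.0]  # perfectly correlated but shifted by 1
    print(concordance_cc(manual, automatic))

    feats = np.array([[1, 2, 1], [2, 4, -1], [3, 6, 1], [4, 8, -1]], dtype=float)
    print(drop_redundant(feats))
```

Unlike the Pearson coefficient, the CCC penalizes a systematic offset between the two measurements, which is why the shifted example above scores well below 1 even though the two vectors are perfectly correlated.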