Improving lung cancer prognosis assessment by incorporating synthetic minority oversampling technique and score fusion method
Author(s) - Yan Shiju, Qian Wei, Guan Yubao, Zheng Bin
Publication year - 2016
Publication title - Medical Physics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.473
H-Index - 180
eISSN - 2473-4209
pISSN - 0094-2405
DOI - 10.1118/1.4948499
Subject(s) - oversampling, receiver operating characteristic, artificial intelligence, feature selection, cross validation, lung cancer, stage (stratigraphy), pattern recognition (psychology), radiomics, medicine, computer science, machine learning, oncology, computer network, paleontology, bandwidth (computing), biology
Purpose: This study aims to investigate the potential to improve lung cancer recurrence risk prediction performance for stage I NSCLC patients by integrating oversampling, feature selection, and score fusion techniques, and to develop an optimal prediction model.

Methods: A dataset involving 94 early stage lung cancer patients was retrospectively assembled, which includes CT images, nine clinical and biological (CB) markers, and outcome of 3‐yr disease‐free survival (DFS) after surgery. Among the 94 patients, 74 remained DFS and 20 had cancer recurrence. Applying a computer‐aided detection scheme, tumors were segmented from the CT images and 35 quantitative image (QI) features were initially computed. Two normalized Gaussian radial basis function network (RBFN) based classifiers were built based on QI features and CB markers separately. To improve prediction performance, the authors applied a synthetic minority oversampling technique (SMOTE) and a BestFirst based feature selection method to optimize the classifiers and also tested fusion methods to combine QI and CB based prediction results.

Results: Using a leave‐one‐case‐out cross‐validation (K‐fold cross‐validation) method, the computed areas under a receiver operating characteristic curve (AUCs) were 0.716 ± 0.071 and 0.642 ± 0.061, when using the QI and CB based classifiers, respectively. By fusion of the scores generated by the two classifiers, AUC significantly increased to 0.859 ± 0.052 (p < 0.05) with an overall prediction accuracy of 89.4%.

Conclusions: This study demonstrated the feasibility of improving prediction performance by integrating SMOTE, feature selection, and score fusion techniques. Combining QI features and CB markers and performing SMOTE prior to feature selection in classifier training enabled the RBFN based classifier to yield improved prediction accuracy.
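For readers unfamiliar with the two techniques named in the abstract, the following is a minimal illustrative sketch, not the authors' implementation. It shows the core idea of SMOTE (synthesizing minority-class samples by interpolating between a minority sample and one of its nearest minority neighbors) and one simple score fusion rule. The abstract does not state which fusion rule was used, so the weighted average here is an assumption, as are the function names `smote`, `fuse_scores`, and `auc`, the neighbor count `k=5`, and the toy data.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples from a list of minority-class feature vectors.

    Each synthetic sample lies on the line segment between a randomly chosen
    minority sample and one of its k nearest minority neighbors.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbor = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbor
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


def fuse_scores(scores_a, scores_b, weight_a=0.5):
    """Weighted-average fusion of two classifiers' per-case scores (assumed rule)."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(scores_a, scores_b)]


def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy two-feature minority class (hypothetical values, not study data).
    recurrence_cases = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.3], [1.1, 2.1]]
    print(smote(recurrence_cases, n_new=3, k=2))

    # Toy scores from two hypothetical classifiers for the same five cases.
    qi_scores = [0.9, 0.4, 0.6, 0.2, 0.3]
    cb_scores = [0.6, 0.7, 0.3, 0.4, 0.1]
    labels = [1, 1, 0, 0, 0]
    print(auc(fuse_scores(qi_scores, cb_scores), labels))
```

In a leave‐one‐case‐out setting, oversampling of this kind would normally be applied only to the training cases of each fold, so that no synthetic sample derived from the held-out case influences its own prediction.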
