Open Access
A Stroke Risk Detection: Improving Hybrid Feature Selection Method
Author(s) - Yonglai Zhang, Yaojian Zhou, Dongsong Zhang, Wenai Song
Publication year - 2019
Publication title - Journal of Medical Internet Research (JMIR)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.446
H-Index - 142
eISSN - 1439-4456
pISSN - 1438-8871
DOI - 10.2196/12437
Subject(s) - feature selection, weighting, ranking (information retrieval), computer science, feature (linguistics), artificial intelligence, Youden's J statistic, data mining, gold standard (test), pattern recognition (psychology), statistics, machine learning, medicine, receiver operating characteristic, mathematics, linguistics, philosophy, radiology
Background: Stroke is one of the most common causes of death. Detecting an individual's risk of stroke is critical yet challenging because stroke has a large number of risk factors.

Objective: This study aimed to address the limitation of ineffective feature selection in existing research on stroke risk detection. We propose a new feature selection method, weighting- and ranking-based hybrid feature selection (WRHFS), to select important risk factors for detecting ischemic stroke.

Methods: WRHFS integrates the strengths of several filter algorithms by following the principle of a wrapper approach. The candidate set of filter-based feature selection models comprised standard deviation, Pearson correlation coefficient, Fisher score, information gain, the Relief algorithm, and the chi-square test. We evaluated the proposed method using sensitivity, specificity, accuracy, and the Youden index as performance metrics.

Results: We selected 792 samples from the electronic records of 13,421 patients in a community hospital. Each sample included 28 features (24 blood test features and 4 demographic features). The proposed method selected 9 important features out of the original 28, with a cumulative contribution of 0.51, and significantly outperformed the baseline methods. Using only the top 9 features, WRHFS achieved a sensitivity of 82.7% (329/398), a specificity of 80.4% (317/394), a classification accuracy of 81.5% (645/792), and a Youden index of 0.63. We also present a chart for visualizing the risk of ischemic stroke.

Conclusions: This study proposed, developed, and evaluated a new feature selection method for identifying the most important features for building effective and parsimonious models for stroke risk detection. The findings provide several novel research contributions and practical implications.
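The four performance metrics named in the abstract follow directly from confusion-matrix counts. The sketch below is not the authors' code; it is a minimal illustration that assumes the reported fractions 329/398 and 317/394 are true positives over all positive samples and true negatives over all negative samples. The function name `confusion_metrics` and its parameter names are hypothetical. Under that assumption the counts give 646 correct classifications rather than the 645 reported, so the recomputed accuracy may differ slightly from the published figure, while the Youden index comes out at about 0.63.

```python
def confusion_metrics(tp: int, fn: int, tn: int, fp: int) -> dict:
    """Compute sensitivity, specificity, accuracy, and Youden's J
    from the four cells of a binary confusion matrix."""
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    youden_j = sensitivity + specificity - 1
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "youden_j": youden_j,
    }


# Counts inferred from the abstract (an assumption, see the text above):
# 329 of 398 positive samples and 317 of 394 negative samples were
# classified correctly.
m = confusion_metrics(tp=329, fn=398 - 329, tn=317, fp=394 - 317)

for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

Youden's J combines sensitivity and specificity into a single number between -1 and 1, which makes it convenient for comparing feature subsets when both error types matter.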
