Optimized variable selection via repeated data splitting

Author(s) - Capanu Marinela, Giurcanu Mihai, Begg Colin B., Gönen Mithat

Publication year - 2020

Publication title - Statistics in Medicine

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.996

H-Index - 183

eISSN - 1097-0258

pISSN - 0277-6715

DOI - 10.1002/sim.8538

Subject(s) - univariate, lasso (programming language), scad, feature selection, selection (genetic algorithm), statistics, variable (mathematics), computer science, model selection, mathematics, machine learning, medicine, multivariate statistics, mathematical analysis, world wide web, myocardial infarction

Model selection in high‐dimensional settings has received substantial attention in recent years; however, similar advancements in the low‐dimensional setting have been lacking. In this article, we introduce a new variable selection procedure for low to moderate scale regressions (n > p). This method repeatedly splits the data into two sets, one for estimation and one for validation, to obtain an empirically optimized threshold, which is then used to screen for variables to include in the final model. In an extensive simulation study, we show that the proposed variable selection technique performs better than the candidate methods (backward elimination via repeated data splitting, univariate screening at the 0.05 level, adaptive LASSO, SCAD): it is among those with the lowest inclusion of noisy predictors, has the highest power to detect the correct model, and is unaffected by correlations among the predictors. We illustrate the methods by applying them to a cohort of patients undergoing hepatectomy at our institution.
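The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of tuning a screening threshold by repeated data splitting. Everything specific in it is an assumption made for illustration and is not taken from the paper: ordinary least squares as the model, the absolute t-statistic as the screening statistic, a 50/50 split, validation mean squared error as the criterion, the grid of candidate thresholds, and the hypothetical function names.

```python
import numpy as np

def t_statistics(X, y):
    """Absolute OLS t-statistics for each column of X (intercept added internally)."""
    n, p = X.shape
    Xd = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    resid = y - Xd @ beta
    sigma2 = resid @ resid / (n - p - 1)          # residual variance estimate
    cov = sigma2 * np.linalg.inv(Xd.T @ Xd)       # covariance of the coefficients
    return np.abs(beta[1:] / np.sqrt(np.diag(cov)[1:]))

def validation_mse(X_est, y_est, X_val, y_val, keep):
    """Refit OLS on the kept columns using the estimation set; score on the validation set."""
    A_est = np.column_stack([np.ones(len(y_est)), X_est[:, keep]])
    A_val = np.column_stack([np.ones(len(y_val)), X_val[:, keep]])
    beta, *_ = np.linalg.lstsq(A_est, y_est, rcond=None)
    return float(np.mean((y_val - A_val @ beta) ** 2))

def select_by_repeated_splitting(X, y, thresholds, n_splits=100, seed=0):
    """Pick the threshold with the lowest average validation error, then screen on all data.

    Assumed scheme (not from the paper): variables with |t| above the
    threshold on the estimation half are kept for that split.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    errors = np.zeros((n_splits, len(thresholds)))
    for s in range(n_splits):
        perm = rng.permutation(n)
        est, val = perm[: n // 2], perm[n // 2:]  # estimation / validation halves
        t_abs = t_statistics(X[est], y[est])
        for j, c in enumerate(thresholds):
            keep = t_abs > c
            errors[s, j] = validation_mse(X[est], y[est], X[val], y[val], keep)
    best = thresholds[int(np.argmin(errors.mean(axis=0)))]
    selected = np.flatnonzero(t_statistics(X, y) > best)
    return best, selected

# Synthetic illustration: 10 predictors, only the first three carry signal.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 10))
y = 2.0 * X[:, 0] + 2.0 * X[:, 1] + 2.0 * X[:, 2] + rng.normal(size=200)
thresholds = np.linspace(0.5, 4.0, 8)
best_threshold, selected = select_by_repeated_splitting(X, y, thresholds)
print("chosen threshold:", best_threshold)
print("selected columns:", selected)
```

In this sketch the threshold is tuned on out-of-sample error rather than fixed in advance (as with screening at a conventional 0.05 level), which is the contrast the abstract draws; the authors' actual criterion and screening statistic may differ.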
