Model selection procedure for high‐dimensional data
Author(s) - Zhang Yongli, Shen Xiaotong
Publication year - 2010
Publication title - Statistical Analysis and Data Mining: The ASA Data Science Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.381
H-Index - 33
eISSN - 1932-1872
pISSN - 1932-1864
DOI - 10.1002/sam.10088
Subject(s) - bayesian information criterion, model selection, selection (genetic algorithm), information criteria, feature selection, computer science, bayesian probability, regression analysis, sample size determination, data mining, mathematics, statistics, artificial intelligence, machine learning
For high‐dimensional regression, the number of predictors may greatly exceed the sample size, but only a small fraction of them are related to the response. Variable selection is therefore unavoidable, and consistent model selection is the primary concern. However, conventional consistent model selection criteria such as the Bayesian information criterion (BIC) may be inadequate because they do not adapt to the model space and because exhaustive search is infeasible. To address these two issues, we establish a probability lower bound for selecting the smallest true model by an information criterion, based on which we propose a model selection criterion, which we call RIC_c, that adapts to the model space. Furthermore, we develop a computationally feasible method that combines the computational power of least angle regression (LAR) with that of RIC_c. Both theoretical and simulation studies show that this method identifies the smallest true model with probability converging to one, provided the smallest true model is selected by LAR. The proposed method is applied to real data from the power market and outperforms backward variable selection in terms of price forecasting accuracy. Copyright © 2010 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 3: 350‐358, 2010
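The two-stage idea in the abstract (build a short path of nested candidate models, then pick one with a penalised criterion rather than searching all subsets) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: greedy forward selection stands in for LAR, the simulated data and the names `forward_path` and `select_by_criterion` are hypothetical, and the "adaptive" penalty of 2 log p per term is used only because it grows with the number of candidate predictors. It is not the RIC_c formula, which the abstract does not state.

```python
import math
import random


def forward_path(X, y, max_steps):
    """Greedy forward selection (a stand-in for a LAR path, not LAR itself).

    X is a list of p columns, each a list of n floats. Returns the order in
    which columns were picked and the residual sum of squares after each
    pick; rss[0] is the RSS of the empty model.
    """
    cols = [col[:] for col in X]      # working copies, orthogonalised below
    resid = y[:]
    remaining = list(range(len(cols)))
    order, rss = [], [sum(r * r for r in resid)]
    for _ in range(max_steps):
        best, best_gain = None, 0.0
        for j in remaining:
            norm2 = sum(v * v for v in cols[j])
            if norm2 < 1e-12:         # column already explained by the model
                continue
            dot = sum(v * r for v, r in zip(cols[j], resid))
            gain = dot * dot / norm2  # RSS reduction from adding column j
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:
            break
        remaining.remove(best)
        q = cols[best]
        qn2 = sum(v * v for v in q)
        # Project the chosen direction out of the residual ...
        coef = sum(v * r for v, r in zip(q, resid)) / qn2
        resid = [r - coef * v for r, v in zip(resid, q)]
        # ... and out of every remaining column (Gram-Schmidt step).
        for j in remaining:
            c = sum(a * b for a, b in zip(cols[j], q)) / qn2
            cols[j] = [a - c * b for a, b in zip(cols[j], q)]
        order.append(best)
        rss.append(sum(r * r for r in resid))
    return order, rss


def select_by_criterion(order, rss, n, penalty_per_term):
    """Pick the path prefix minimising n*log(RSS_k/n) + penalty_per_term*k."""
    scores = [n * math.log(rss[k] / n) + penalty_per_term * k
              for k in range(len(rss))]
    k_best = min(range(len(scores)), key=scores.__getitem__)
    return order[:k_best]


# Simulated data (an assumption): p > n, only three predictors matter.
random.seed(0)
n, p = 100, 120
true_support = [3, 17, 42]
X = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]
y = [sum(5.0 * X[j][i] for j in true_support) + random.gauss(0, 0.5)
     for i in range(n)]

order, rss = forward_path(X, y, max_steps=10)
bic_model = select_by_criterion(order, rss, n, math.log(n))
# Hypothetical penalty that grows with p; NOT the paper's RIC_c.
adaptive_model = select_by_criterion(order, rss, n, 2 * math.log(p))

print("path order:        ", order)
print("BIC-style choice:  ", sorted(bic_model))
print("2*log(p) choice:   ", sorted(adaptive_model))
```

Only the models along the path are scored, so the cost is linear in the path length instead of exponential in p. Because the heavier penalty charges more per term, it can never select a larger prefix than the BIC-style penalty on the same path.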
