Score test variable screening
Author(s) - Zhao Sihai Dave, Li Yi
Publication year - 2014
Publication title - Biometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.298
H-Index - 130
eISSN - 1541-0420
pISSN - 0006-341X
DOI - 10.1111/biom.12209
Subject(s) - resampling , computer science , quantile regression , regression , quantile , regression analysis , variable (mathematics) , machine learning , data mining , statistics , artificial intelligence , mathematics , mathematical analysis
Summary Variable screening has emerged as a crucial first step in the analysis of high‐throughput data, but existing procedures can be computationally cumbersome, difficult to justify theoretically, or inapplicable to certain types of analyses. Motivated by a high‐dimensional censored quantile regression problem in multiple myeloma genomics, this article makes three contributions. First, we establish a score test‐based screening framework, which is widely applicable, extremely computationally efficient, and relatively simple to justify. Second, we propose a resampling‐based procedure for selecting the number of variables to retain after screening according to the principle of reproducibility. Finally, we propose a new iterative score test screening method, which is closely related to sparse regression. In simulations, we apply our methods to four different regression models and show that they can outperform existing procedures. We also apply score test screening to an analysis of gene expression data from multiple myeloma patients, using a censored quantile regression model to identify high‐risk genes.