Sparse nonparametric regression with regularized tensor product kernel
Author(s) - Yu Hang, Wang Yuanjia, Zeng Donglin
Publication year - 2020
Publication title - Stat
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.61
H-Index - 18
ISSN - 2049-1573
DOI - 10.1002/sta4.300
Subject(s) - feature selection, nonparametric statistics, kernel (algebra), computer science, artificial intelligence, feature (linguistics), kernel method, pattern recognition (psychology), parametric statistics, machine learning, semiparametric regression, nonparametric regression, mathematical optimization, kernel embedding of distributions, algorithm, mathematics, support vector machine, regression analysis, econometrics, statistics, linguistics, philosophy, combinatorics
With growing interest in using black-box machine learning for complex data with many feature variables, it is critical to obtain a prediction model that depends on only a small set of features to maximize generalizability. Feature selection therefore remains an important and challenging problem in modern applications. Most existing methods for feature selection are based on either parametric or semiparametric models, so their performance can suffer severely from model misspecification when high-order nonlinear interactions among the features are present. A very limited number of approaches for nonparametric feature selection have been proposed, but they are computationally intensive and may not even converge. In this paper, we propose a novel and computationally efficient approach for nonparametric feature selection in regression, based on a tensor product kernel function over the feature space. The importance of each feature is governed by a parameter in the kernel function, which can be computed efficiently and iteratively by a modified alternating direction method of multipliers algorithm. We prove the oracle selection property of the proposed method. Finally, we demonstrate the superior performance of our approach compared with existing methods via simulation studies and an application to the prediction of Alzheimer's disease.
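To make the idea of a per-feature importance parameter inside a tensor product kernel concrete, here is a minimal sketch. It is not the authors' formulation: the abstract does not give the kernel's exact form, so the Gaussian factors, the function names `tensor_product_kernel` and `kernel_ridge_fit_predict`, the weight vector `lam`, and the plain ridge solver (used in place of the modified ADMM algorithm) are all assumptions made for illustration. The point it shows is that a feature whose weight is zero contributes a constant factor of 1 and so drops out of the model.

```python
import numpy as np

def tensor_product_kernel(X, Z, lam):
    """Hypothetical tensor product kernel: prod_j exp(-lam_j * (x_j - z_j)^2).

    X: (n, p) array, Z: (m, p) array, lam: (p,) nonnegative feature weights.
    A weight lam_j == 0 makes the j-th factor identically 1, so feature j
    has no influence on the kernel (assumed form, not taken from the paper).
    """
    X = np.asarray(X, dtype=float)
    Z = np.asarray(Z, dtype=float)
    lam = np.asarray(lam, dtype=float)
    # Per-feature squared differences, shape (n, m, p).
    sq = (X[:, None, :] - Z[None, :, :]) ** 2
    # A product of exponentials equals the exponential of the weighted sum.
    return np.exp(-(sq * lam).sum(axis=2))

def kernel_ridge_fit_predict(X_train, y_train, X_test, lam, ridge=1e-3):
    """Kernel ridge regression with fixed weights lam (illustrative only).

    The paper estimates the weights with a modified ADMM algorithm; here
    lam is simply supplied by the caller.
    """
    K = tensor_product_kernel(X_train, X_train, lam)
    alpha = np.linalg.solve(K + ridge * np.eye(K.shape[0]), y_train)
    return tensor_product_kernel(X_test, X_train, lam) @ alpha
```

With `lam = [1.0, 0.0]`, for instance, the kernel matrix is unchanged by any modification of the second feature, which is the sense in which a sparse weight vector performs feature selection in this sketch.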