MILFM: Multiple index latent factor model based on high‐dimensional features
Author(s) - Yang Hojin, Zhu Hongtu, Ibrahim Joseph G.
Publication year - 2018
Publication title - Biometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.298
H-Index - 130
eISSN - 1541-0420
pISSN - 0006-341X
DOI - 10.1111/biom.12866
Subject(s) - covariate, computer science, consistency (knowledge bases), data mining, factor analysis, latent variable, set (abstract data type), latent variable model, key (lock), machine learning, index (typography), data set, artificial intelligence, factor (programming language), world wide web, programming language, computer security
Summary This article develops a multiple‐index latent factor modeling (MILFM) framework for building an accurate prediction model for clinical outcomes from a massive number of features. The prediction model is built with a three‐stage estimation procedure. First, MILFM uses an independent screening method to select a set of informative features, which may have a complex nonlinear relationship with the outcome variables. Second, a latent factor model projects all informative predictors onto a small number of local subspaces, yielding a few key features that capture reliable and informative covariate information. Finally, a regularized empirical estimate is fitted to those key features in order to predict clinical outcomes accurately. We systematically investigate the theoretical properties of MILFM, such as risk bounds and selection consistency. Our simulation results and real data analysis show that MILFM outperforms many state‐of‐the‐art methods in terms of prediction accuracy.
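
The three stages described in the summary (screen, project, fit) can be illustrated with a minimal sketch. This is not the authors' estimator: every function name and parameter below is hypothetical, and each stage uses a simple stand-in chosen only to show the shape of the pipeline. Screening is done by absolute marginal correlation (a linear criterion, whereas the summary allows nonlinear relationships), the latent factor step is replaced by a projection onto leading principal directions, and the regularized fit is plain ridge regression. The data are synthetic.

```python
import numpy as np


def screen_features(X, y, keep):
    """Stage 1 (stand-in): keep the features with the largest absolute
    marginal correlation with the outcome."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    denom[denom == 0] = np.inf  # constant features receive a score of 0
    scores = np.abs(Xc.T @ yc) / denom
    return np.sort(np.argsort(scores)[::-1][:keep])


def latent_factors(X_sel, n_factors):
    """Stage 2 (stand-in): leading principal directions of the screened
    features, used here in place of a latent factor model."""
    mean = X_sel.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_sel - mean, full_matrices=False)
    return mean, Vt[:n_factors].T


def ridge_fit(Z, y, lam):
    """Stage 3 (stand-in): ridge regression with an unpenalized intercept."""
    z_mean, y_mean = Z.mean(axis=0), y.mean()
    Zc = Z - z_mean
    beta = np.linalg.solve(Zc.T @ Zc + lam * np.eye(Z.shape[1]),
                           Zc.T @ (y - y_mean))
    return y_mean - z_mean @ beta, beta


def fit_pipeline(X, y, keep=20, n_factors=3, lam=1.0):
    """Chain the three stand-in stages and return the fitted pieces."""
    idx = screen_features(X, y, keep)
    mean, loadings = latent_factors(X[:, idx], n_factors)
    Z = (X[:, idx] - mean) @ loadings
    intercept, beta = ridge_fit(Z, y, lam)
    return {"idx": idx, "mean": mean, "loadings": loadings,
            "intercept": intercept, "beta": beta}


def predict(model, X):
    """Apply the stored screening, projection, and ridge coefficients."""
    Z = (X[:, model["idx"]] - model["mean"]) @ model["loadings"]
    return model["intercept"] + Z @ model["beta"]


# Synthetic example: one hidden factor drives the first 10 of 500 features
# and the outcome; the remaining features are pure noise.
rng = np.random.default_rng(0)
n, p = 200, 500
f = rng.normal(size=n)
X = rng.normal(size=(n, p))
X[:, :10] = f[:, None] + 0.3 * rng.normal(size=(n, 10))
y = 2.0 * f + 0.1 * rng.normal(size=n)

model = fit_pipeline(X, y, keep=20, n_factors=3, lam=1.0)
pred = predict(model, X)
```

In this toy setting the screening step reduces 500 features to 20 before any factor extraction, which is the point of the ordering: the projection and the regularized fit only ever see a small, outcome-relevant subset. The in-sample predictions here say nothing about the risk bounds or selection consistency established in the article.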