Deep latent variable models for generating knockoffs
Author(s) - Liu Ying, Zheng Cheng
Publication year - 2019
Publication title - Stat
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.61
H-Index - 18
ISSN - 2049-1573
DOI - 10.1002/sta4.260
Subject(s) - inference , latent variable , computer science , false discovery rate , artificial intelligence , machine learning , feature selection , variable (mathematics) , latent variable model , statistical inference , model selection , data mining , mathematics , statistics , mathematical analysis , biochemistry , chemistry , gene
Selective inference is an emerging field in big data analytics; it aims to perform variable selection and provide statistical inference at the same time. Among the various selective inference frameworks, the model-X framework offers the most flexible tool for equipping almost any machine learning method with false discovery rate (FDR) controlled variable selection. This paper provides a practical and flexible approach to generating knockoffs: we propose fitting a latent variable model to the predictors. Under general conditions, the knockoffs can be generated by approximate inference of a latent variable that captures all the correlation among the predictors. We propose an algorithm based on recent advances in stochastic variational inference to approximately reconstruct the distribution of the data via the latent variables. We demonstrate that the proposed method achieves FDR control and higher power than existing knockoff generation methods in various simulated settings and in a real data example on finding mutations associated with drug resistance in patients with human immunodeficiency virus type 1.
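To make the idea of generating knockoffs through a latent variable concrete, the following is a minimal sketch and not the authors' algorithm. It swaps the deep latent variable model and stochastic variational inference described in the abstract for a linear-Gaussian factor model, x = W z + eps, whose posterior over z is available in closed form. The function name `sample_latent_knockoffs`, the loading matrix `W`, and the noise variance `sigma2` are hypothetical and assumed known here; in practice they would have to be fitted to the data.

```python
import numpy as np


def sample_latent_knockoffs(X, W, sigma2, rng):
    """Draw one knockoff copy per row of X.

    Assumed model (a stand-in, not the paper's deep model):
        z ~ N(0, I_k),  x | z ~ N(W z, sigma2 * I_p)
    """
    n, p = X.shape
    k = W.shape[1]

    # Exact Gaussian posterior of z given x for this model:
    #   mean = (W^T W + sigma2 I)^{-1} W^T x
    #   cov  = sigma2 (W^T W + sigma2 I)^{-1}
    M_inv = np.linalg.inv(W.T @ W + sigma2 * np.eye(k))
    post_mean = X @ W @ M_inv  # shape (n, k); M_inv is symmetric
    post_chol = np.linalg.cholesky(sigma2 * M_inv)

    # Step 1: sample the latent variable from its posterior.
    Z = post_mean + rng.standard_normal((n, k)) @ post_chol.T

    # Step 2: regenerate the predictors from z with fresh, independent noise.
    return Z @ W.T + np.sqrt(sigma2) * rng.standard_normal((n, p))


# Usage on simulated data drawn from the same assumed model.
rng = np.random.default_rng(0)
n, p, k, sigma2 = 20000, 5, 2, 1.0
W = 0.5 * rng.standard_normal((p, k))  # hypothetical loadings
X = rng.standard_normal((n, k)) @ W.T + np.sqrt(sigma2) * rng.standard_normal((n, p))
X_tilde = sample_latent_knockoffs(X, W, sigma2, rng)
```

In this sketch the coordinates of x are conditionally independent given z, so the original and the regenerated predictors are exchangeable coordinate by coordinate, which is the property knockoffs need. With a deep model the posterior is only approximated, so this exchangeability would hold only approximately.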