Performance and Robustness of Penalized and Unpenalized Methods for Genetic Prediction of Complex Human Disease
Author(s) - Abraham Gad, Kowalczyk Adam, Zobel Justin, Inouye Michael
Publication year - 2013
Publication title - Genetic Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.301
H-Index - 98
eISSN - 1098-2272
pISSN - 0741-0395
DOI - 10.1002/gepi.21698
Subject(s) - elastic net regularization, linkage disequilibrium, lasso (programming language), computer science, snp, robustness (evolution), feature selection, leverage (statistics), regression, logistic regression, artificial intelligence, machine learning, single nucleotide polymorphism, statistics, biology, mathematics, genetics, genotype, world wide web, gene
A central goal of medical genetics is to accurately predict complex disease from genotypes. Here, we present a comprehensive analysis of simulated and real data using lasso and elastic‐net penalized support‐vector machine models, a mixed‐effects linear model, a polygenic score, and unpenalized logistic regression. In simulation, the sparse penalized models achieved lower false‐positive rates and higher precision than the other methods for detecting causal SNPs. The common practice of prefiltering SNP lists for subsequent penalized modeling was examined and shown to substantially reduce the ability to recover the causal SNPs. Using genome‐wide SNP profiles across eight complex diseases within cross‐validation, lasso and elastic‐net models achieved substantially better predictive ability in celiac disease, type 1 diabetes, and Crohn's disease, and had equivalent predictive ability in the rest, with the results in celiac disease strongly replicating between independent datasets. We investigated the effect of linkage disequilibrium on the predictive models, showing that the penalized methods leverage this information to their advantage, compared with methods that assume SNP independence. Our findings show that sparse penalized approaches are robust across different disease architectures, producing as good as or better phenotype predictions and variance explained. This has fundamental ramifications for the selection and future development of methods to genetically predict human disease.
