Effectiveness of Shrinkage and Variable Selection Methods for the Prediction of Complex Human Traits using Data from Distantly Related Individuals
Author(s) -
Berger Swetlana,
Pérez‐Rodríguez Paulino,
Veturi Yogasudha,
Simianer Henner,
de los Campos Gustavo
Publication year - 2015
Publication title -
Annals of Human Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.537
H-Index - 77
eISSN - 1469-1809
pISSN - 0003-4800
DOI - 10.1111/ahg.12099
Subject(s) - quantitative trait locus, heritability, biology, linkage disequilibrium, trait, genome wide association study, selection (genetic algorithm), genetic association, genetics, single nucleotide polymorphism, genetic architecture, regression, statistics, genotype, mathematics, gene, computer science, artificial intelligence, programming language
Summary Genome‐wide association studies (GWAS) have detected large numbers of variants associated with complex human traits and diseases. However, the proportion of variance explained by GWAS‐significant single nucleotide polymorphisms has usually been small. This has prompted interest in the use of whole‐genome regression (WGR) methods. However, there has been limited research on the factors that affect the prediction accuracy (PA) of WGRs when applied to human data from distantly related individuals. Here, we examine, using real human genotypes and simulated phenotypes, how trait complexity, marker‐quantitative trait loci (QTL) linkage disequilibrium (LD), and the model used affect the performance of WGRs. Our results indicated that the estimated rate of missing heritability depends on the extent of marker‐QTL LD. However, this parameter was not greatly affected by trait complexity. Regarding PA, our results indicated that: (a) under perfect marker‐QTL LD, WGR can achieve moderately high prediction accuracy, and with simple genetic architectures variable selection methods outperform shrinkage procedures; and (b) under imperfect marker‐QTL LD, variable selection methods can achieve reasonably good PA with simple or moderately complex genetic architectures; however, the PA of these methods deteriorated as trait complexity increased, and with highly complex traits variable selection and shrinkage methods both performed poorly. This was confirmed with an analysis of human height.