Open Access
Leveraging auxiliary data from arbitrary distributions to boost GWAS discovery with Flexible cFDR
Author(s) - Anna Hutchinson, Guillermo Reales, Thomas Willis, Chris Wallace
Publication year - 2021
Publication title - PLOS Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.587
H-Index - 233
eISSN - 1553-7404
pISSN - 1553-7390
DOI - 10.1371/journal.pgen.1009853
Subject(s) - genome wide association study, false discovery rate, covariate, pleiotropy, type I and type II errors, statistical power, genetic association, computer science, parametric statistics, biobank, biology, computational biology, data mining, statistics, single nucleotide polymorphism, machine learning, bioinformatics, genetics, mathematics, gene, genotype, phenotype
Genome-wide association studies (GWAS) have identified thousands of genetic variants that are associated with complex traits. However, a stringent significance threshold is required to identify robust genetic associations. Leveraging relevant auxiliary covariates has the potential to boost statistical power so that more true associations exceed the significance threshold. In particular, abundant pleiotropy and the non-random distribution of SNPs across functional categories suggest that leveraging GWAS test statistics from related traits and/or functional genomic data may boost GWAS discovery. While type 1 error rate control has become standard in GWAS, control of the false discovery rate (FDR) can be a more powerful approach. The conditional false discovery rate (cFDR) extends the standard FDR framework by conditioning on auxiliary data to call significant associations, but current implementations are restricted to auxiliary data satisfying specific parametric distributions, typically GWAS p-values for related traits. We relax these distributional assumptions, enabling an extension of the cFDR framework that supports auxiliary covariates from arbitrary continuous distributions (“Flexible cFDR”). Our method can be applied iteratively, thereby supporting multi-dimensional covariate data. Through simulations we show that Flexible cFDR increases sensitivity whilst controlling FDR after one or several iterations. We further demonstrate its practical potential through application to an asthma GWAS, leveraging various functional genomic data to find additional genetic associations for asthma, which we validate in the larger, independent UK Biobank data resource.
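
To make the idea of conditioning on auxiliary data concrete, the following is a minimal sketch of the simple empirical cFDR estimator used in the earlier, p-value-based setting that the abstract describes as restrictive. It is not the Flexible cFDR method itself. The function name `empirical_cfdr` and the toy inputs are hypothetical and chosen only for illustration. For each SNP with principal p-value p and auxiliary p-value q, the estimate is p divided by the empirical proportion of SNPs with principal p-value at most p among those with auxiliary p-value at most q.

```python
def empirical_cfdr(p, q):
    """Empirical cFDR estimate for each (p_i, q_i) pair.

    Illustrative only: this is the basic estimator for auxiliary p-values,
    not the Flexible cFDR procedure, which handles arbitrary continuous
    covariates.

    cFDR_i = p_i * #{j : q_j <= q_i} / #{j : p_j <= p_i and q_j <= q_i}
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    estimates = []
    for p_i, q_i in zip(p, q):
        # SNPs whose auxiliary p-value is at least as small as q_i
        n_aux = sum(1 for q_j in q if q_j <= q_i)
        # Of those, SNPs whose principal p-value is at least as small as p_i.
        # This count is always >= 1 because SNP i itself qualifies.
        n_joint = sum(1 for p_j, q_j in zip(p, q) if p_j <= p_i and q_j <= q_i)
        # Cap at 1 so the estimate stays a valid probability
        estimates.append(min(1.0, p_i * n_aux / n_joint))
    return estimates


# Toy data (made up): SNPs 0 and 3 share the same principal p-value,
# but SNP 0 has much stronger auxiliary evidence.
principal = [0.01, 0.2, 0.6, 0.01]
auxiliary = [0.001, 0.002, 0.9, 0.8]
cfdr = empirical_cfdr(principal, auxiliary)
```

In this toy input the two SNPs with identical principal p-values receive different estimates: the one with the smaller auxiliary p-value gets the smaller cFDR, which is the mechanism by which auxiliary data can raise or lower a SNP's priority.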
