Open Access
Integrating comprehensive functional annotations to boost power and accuracy in gene-based association analysis
Author(s) -
Corbin Quick,
Xiaoquan Wen,
Gonçalo R. Abecasis,
Michael Boehnke,
Hyun Min Kang
Publication year - 2020
Publication title -
PLOS Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.587
H-Index - 233
eISSN - 1553-7404
pISSN - 1553-7390
DOI - 10.1371/journal.pgen.1009060
Subject(s) - genome-wide association study, genetic association, biology, computational biology, annotation, gene, gene regulatory network, biobank, gene annotation, association test, genetics, statistical power, genome, single nucleotide polymorphism, genotype, gene expression, statistics, mathematics
Gene-based association tests aggregate genotypes across multiple variants for each gene, providing an interpretable gene-level analysis framework for genome-wide association studies (GWAS). Early applications of gene-based tests often focused on rare coding variants; a more recent wave of gene-based methods, e.g. transcriptome-wide association studies (TWAS), uses expression quantitative trait loci (eQTLs) to interrogate regulatory associations. Regulatory variants are expected to be particularly valuable for gene-based analysis, since most GWAS associations to date are non-coding. However, identifying causal genes from regulatory associations remains challenging and contentious. Here, we present a statistical framework and computational tool to integrate heterogeneous annotations with GWAS summary statistics for gene-based analysis, applied with comprehensive coding and tissue-specific regulatory annotations. We compare power and accuracy in identifying causal genes across single-annotation, omnibus, and annotation-agnostic gene-based tests in simulation studies and in an analysis of 128 traits from the UK Biobank, and find that incorporating heterogeneous annotations in gene-based association analysis increases power and improves performance in identifying causal genes.
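To make the idea of aggregating variant-level evidence concrete, the following is a minimal sketch of a generic annotation-weighted gene-based test computed from GWAS summary statistics. It is not the authors' tool or their exact method. The function names, the example weights, and the use of a Cauchy combination for the omnibus step are all assumptions chosen for illustration; the abstract does not specify how the omnibus test is constructed.

```python
import math


def weighted_gene_z(z, weights, ld):
    """Burden-style gene-level z-score from variant-level z-scores.

    z       : per-variant GWAS z-scores for one gene
    weights : per-variant annotation weights (hypothetical values)
    ld      : LD correlation matrix among the variants (list of lists)

    Under the null, sum(w_i * z_i) has variance w' R w, so dividing by
    its square root yields a standard normal statistic.
    """
    n = len(z)
    if len(weights) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("z, weights, and ld must have matching dimensions")
    numerator = sum(w * zi for w, zi in zip(weights, z))
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("weighted LD variance must be positive")
    return numerator / math.sqrt(variance)


def two_sided_p(z_score):
    """Two-sided p-value for a standard normal statistic."""
    return math.erfc(abs(z_score) / math.sqrt(2.0))


def cauchy_combine(p_values):
    """Equal-weight Cauchy combination of p-values.

    Used here as one possible omnibus step across annotation classes;
    this choice is an assumption of the sketch, not taken from the abstract.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    t = sum(math.tan((0.5 - p) * math.pi) for p in p_values) / len(p_values)
    return 0.5 - math.atan(t) / math.pi


# Illustrative inputs only: two variants in one gene, with made-up weights
# standing in for a coding annotation and a regulatory annotation.
z_scores = [2.0, 2.0]
ld_matrix = [[1.0, 0.0], [0.0, 1.0]]
p_coding = two_sided_p(weighted_gene_z(z_scores, [1.0, 0.0], ld_matrix))
p_regulatory = two_sided_p(weighted_gene_z(z_scores, [0.2, 0.8], ld_matrix))
p_omnibus = cauchy_combine([p_coding, p_regulatory])
```

Each single-annotation test here uses one weight vector, while the omnibus step merges the per-annotation p-values into a single gene-level p-value. A Cauchy combination is convenient for this because it stays valid under correlation between the component tests, which is expected when different annotations weight overlapping variants.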