Optimal Unified Approach for Rare-Variant Association Testing with Application to Small-Sample Case-Control Whole-Exome Sequencing Studies
Author(s) -
Seunggeun Lee,
Mary J. Emond,
Michael J. Bamshad,
Kathleen C. Barnes,
Mark J. Rieder,
Deborah A. Nickerson,
David C. Christiani,
Mark M. Wurfel,
Xihong Lin
Publication year - 2012
Publication title -
The American Journal of Human Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 6.661
H-Index - 302
eISSN - 1537-6605
pISSN - 0002-9297
DOI - 10.1016/j.ajhg.2012.06.007
Subject(s) - type I and type II errors , exome sequencing , sample size determination , association test , exome , genetic association , computer science , association (psychology) , statistical power , computational biology , data mining , statistics , biology , genetics , mathematics , phenotype , single nucleotide polymorphism , genotype , psychology , gene , psychotherapist
We propose in this paper a unified approach for testing the association between rare variants and phenotypes in sequencing association studies. This approach maximizes power by adaptively using the data to optimally combine the burden test and the nonburden sequence kernel association test (SKAT). Burden tests are more powerful when most variants in a region are causal and the effects are in the same direction, whereas SKAT is more powerful when a large fraction of the variants in a region are noncausal or the effects of causal variants are in different directions. The proposed unified test maintains the power in both scenarios. We show that the unified test corresponds to the optimal test in an extended family of SKAT tests, which we refer to as SKAT-O. The second goal of this paper is to develop a small-sample adjustment procedure for the proposed methods for the correction of conservative type I error rates of SKAT family tests when the trait of interest is dichotomous and the sample size is small. Both small-sample-adjusted SKAT and the optimal unified test (SKAT-O) are computationally efficient and can easily be applied to genome-wide sequencing association studies. We evaluate the finite sample performance of the proposed methods using extensive simulation studies and illustrate their application using the acute-lung-injury exome-sequencing data of the National Heart, Lung, and Blood Institute Exome Sequencing Project.
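The combination described in the abstract is commonly written as a family of statistics Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden, where rho = 0 gives SKAT and rho = 1 gives the burden test. The following is a minimal sketch of that statistic only, not the authors' implementation. The function name, the argument names, the toy data, and the flat weights are all assumptions made for illustration, and the steps that turn these statistics into a single p-value (choosing the best rho and applying the small-sample adjustment) are omitted.

```python
import numpy as np


def skat_family_statistics(G, resid, weights, rhos):
    """Return {rho: Q_rho} with Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden.

    G       : (n_samples, n_variants) genotype matrix (hypothetical input)
    resid   : (n_samples,) residuals y - mu_hat from the null model
    weights : (n_variants,) per-variant weights (illustrative choice)
    rhos    : iterable of values in [0, 1]
    """
    G = np.asarray(G, dtype=float)
    resid = np.asarray(resid, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Weighted score for each variant: w_j * sum_i g_ij * (y_i - mu_i)
    scores = weights * (G.T @ resid)

    q_skat = float(np.sum(scores ** 2))    # sum of squared scores
    q_burden = float(np.sum(scores) ** 2)  # square of the summed scores

    return {rho: (1 - rho) * q_skat + rho * q_burden for rho in rhos}


# Toy data (assumed): 4 subjects, 2 rare variants, dichotomous trait.
y = np.array([1.0, 0.0, 1.0, 0.0])
G = np.array([[1, 0],
              [0, 1],
              [1, 0],
              [0, 0]])
resid = y - y.mean()  # intercept-only null model, for illustration
q = skat_family_statistics(G, resid, weights=[1.0, 1.0], rhos=[0.0, 0.5, 1.0])
```

In this toy example the two variants have scores of opposite sign, so they partly cancel in the burden statistic (rho = 1) but not in the SKAT statistic (rho = 0), which mirrors the scenario the abstract describes as favoring SKAT.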