Identifying significant gene‐environment interactions using a combination of screening testing and hierarchical false discovery rate control

Author(s) - Frost H. Robert, Shen Li, Saykin Andrew J., Williams Scott M., Moore Jason H.

Publication year - 2016

Publication title - Genetic Epidemiology

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.301

H-Index - 98

eISSN - 1098-2272

pISSN - 0741-0395

DOI - 10.1002/gepi.21997

Subject(s) - type I and type II errors, computer science, false discovery rate, multiple comparisons problem, statistical power, statistic, statistical hypothesis testing, test statistic, regression, data mining, machine learning, statistics, mathematics, biology, gene, genetics

Although gene‐environment (G×E) interactions play an important role in many biological systems, detecting these interactions within genome‐wide data can be challenging due to the loss of statistical power incurred by multiple hypothesis correction. To address the challenge of poor power and the limitations of existing multistage methods, we recently developed a screening‐testing approach for G×E interaction detection that combines elastic net penalized regression with joint estimation to support a single omnibus test for the presence of G×E interactions. In our original work on this technique, however, we did not assess type I error control or power, and we evaluated the method using just a single, small bladder cancer data set. In this paper, we extend the original method in two important directions and provide a more rigorous performance evaluation. First, we introduce a hierarchical false discovery rate approach to formally assess the significance of individual G×E interactions. Second, to support the analysis of truly genome‐wide data sets, we incorporate a score statistic‐based prescreening step to reduce the number of single nucleotide polymorphisms (SNPs) prior to fitting the first‐stage penalized regression model. To assess the statistical properties of our method, we compare the type I error rate and statistical power of our approach with those of competing techniques using both simple simulation designs and designs based on real disease architectures. Finally, we demonstrate the ability of our approach to identify biologically plausible SNP‐education interactions relative to Alzheimer's disease status using genome‐wide association study data from the Alzheimer's Disease Neuroimaging Initiative (ADNI).
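
To make the idea of testing individual interactions only after an omnibus test more concrete, the following is a minimal Python sketch under stated assumptions. It is not the procedure from the paper, whose details are not given in this abstract. It uses the standard Benjamini-Hochberg step-up rule as a stand-in for false discovery rate control, and a simple two-level gate in which individual interaction p-values are examined only when the omnibus p-value falls at or below the target level. The function names `benjamini_hochberg` and `gated_interaction_tests`, and the example p-values, are hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], q: float = 0.05) -> List[bool]:
    """Return one reject flag per hypothesis using the BH step-up rule at level q."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest 1-based rank k such that the k-th smallest p-value is <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Step-up: reject every hypothesis ranked at or below k_max.
    reject = [False] * m
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


def gated_interaction_tests(
    omnibus_p: float, interaction_p: Sequence[float], q: float = 0.05
) -> List[bool]:
    """Hypothetical two-level gate (an assumption, not the paper's method):
    test individual interactions only if the omnibus test rejects at level q."""
    if omnibus_p > q:
        return [False] * len(interaction_p)
    return benjamini_hochberg(interaction_p, q)


if __name__ == "__main__":
    # Illustrative p-values only; these are not results from any data set.
    flags = gated_interaction_tests(0.01, [0.9, 0.035, 0.001, 0.03])
    print(flags)
```

With a single omnibus hypothesis at the top level, the gate simply decides whether the second-level tests are run at all; a full hierarchical procedure with several families of hypotheses would also adjust the second-level threshold for the number of families selected.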
