A novel association test for multiple secondary phenotypes from a case‐control GWAS
Author(s) - Ray Debashree, Basu Saonli
Publication year - 2017
Publication title - Genetic Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.301
H-Index - 98
eISSN - 1098-2272
pISSN - 0741-0395
DOI - 10.1002/gepi.22045
Subject(s) - covariate, genome wide association study, genetic association, type i and type ii errors, multivariate statistics, single nucleotide polymorphism, logistic regression, odds ratio, statistics, population, inference, biology, genetics, computer science, genotype, mathematics, medicine, gene, artificial intelligence, environmental health
ABSTRACT In the past decade, many genome-wide association studies (GWASs) have used a case-control design to explore the association of single nucleotide polymorphisms (SNPs) with complex diseases. These GWASs collect information not only on disease status (the primary phenotype, D) and the SNPs (genotypes, X), but also on several risk factors and traits. Recent literature and grant proposals point toward a trend of reusing existing large case-control data sets to explore genetic associations of additional traits (secondary phenotypes, Y) collected during the study. These secondary phenotypes may be correlated, so a proper analysis warrants a multivariate approach. Commonly used multivariate methods are not equipped to account for the non-random sampling scheme. Current ad hoc practices include analyses without any adjustment and analyses that adjust for D as a covariate. Our theoretical and empirical studies suggest that the type I error rate for testing genetic association of secondary traits can be substantially inflated when both X and Y are associated with D, even when there is no association between X and Y in the underlying (target) population. Whether using D as a covariate helps maintain the type I error rate depends heavily on the disease mechanism and the underlying causal structure (which is often unknown). To avoid grossly incorrect inference, we propose the proportional odds model adjusted for propensity score (POM-PS). It uses a proportional odds logistic regression of X on Y and includes the estimated conditional probability of being diseased as a covariate. We demonstrate the validity and advantages of POM-PS and compare it with some existing methods in extensive simulation experiments that mimic plausible scenarios of dependency among Y, X, and D. Finally, we use POM-PS to jointly analyze four adiposity traits in a type 2 diabetes (T2D) case-control sample from the population-based Metabolic Syndrome in Men (METSIM) study. Only the POM-PS analysis of the T2D case-control sample seems to provide valid association signals.
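
The abstract's central warning, that an unadjusted analysis of a case-control sample can show an X-Y association absent from the target population, can be illustrated with a small simulation. The sketch below is not the authors' simulation design and does not implement POM-PS. It assumes a hypothetical logistic disease model in which one SNP X and one continuous trait Y independently raise the risk of D; all parameter values (allele frequency, effect sizes, intercept, population size, seed) are illustrative assumptions.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n_pop=200_000, maf=0.3, beta_x=0.7, beta_y=1.0,
             intercept=-4.0, seed=2017):
    """Simulate a population, then draw a 1:1 case-control sample.

    All parameter values are illustrative assumptions, not taken
    from the paper. X and Y are generated independently, so any
    X-Y correlation in the pooled case-control sample is an
    artifact of sampling on D.
    """
    rng = random.Random(seed)
    cases, controls = [], []
    for _ in range(n_pop):
        # Genotype: number of minor alleles (0, 1, or 2).
        x = int(rng.random() < maf) + int(rng.random() < maf)
        # Secondary trait, drawn independently of the genotype.
        y = rng.gauss(0.0, 1.0)
        # Disease risk depends on both X and Y (assumed logistic model).
        eta = intercept + beta_x * x + beta_y * y
        p_disease = 1.0 / (1.0 + math.exp(-eta))
        if rng.random() < p_disease:
            cases.append((x, y))
        else:
            controls.append((x, y))

    population = cases + controls
    # Case-control design: keep all cases and an equal number of controls.
    sample = cases + rng.sample(controls, len(cases))

    corr_pop = pearson([x for x, _ in population], [y for _, y in population])
    corr_cc = pearson([x for x, _ in sample], [y for _, y in sample])
    return corr_pop, corr_cc, len(cases)


corr_pop, corr_cc, n_cases = simulate()
print(f"cases in population: {n_cases}")
print(f"X-Y correlation, full population:     {corr_pop:.4f}")
print(f"X-Y correlation, case-control sample: {corr_cc:.4f}")
```

Under these assumptions, the population correlation should sit near zero while the pooled case-control correlation is clearly positive, because cases are over-represented and carry both more risk alleles and higher trait values. This is the kind of spurious signal an unadjusted test would pick up.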
