A Bayesian toolkit for genetic association studies
Author(s) - Lunn David J., Whittaker John C., Best Nicky
Publication year - 2006
Publication title - Genetic Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.301
H-Index - 98
eISSN - 1098-2272
pISSN - 0741-0395
DOI - 10.1002/gepi.20140
Subject(s) - covariate, missing data, computer science, bayesian probability, markov chain monte carlo, data mining, genetic association, bayes' theorem, range (aeronautics), machine learning, artificial intelligence, genotype, biology, genetics, materials science, single nucleotide polymorphism, composite material, gene
Abstract - We present a range of modelling components designed to facilitate Bayesian analysis of genetic‐association‐study data. A key feature of our approach is the ability to combine different submodels together, almost arbitrarily, for dealing with the complexities of real data. In particular, we propose various techniques for selecting the “best” subset of genetic predictors for a specific phenotype (or set of phenotypes). At the same time, we may control for complex, non‐linear relationships between phenotypes and additional (non‐genetic) covariates as well as accounting for any residual correlation that exists among multiple phenotypes. Both of these additional modelling components are shown to potentially aid in detecting the underlying genetic signal. We may also account for uncertainty regarding missing genotype data. Indeed, at the heart of our approach is a novel method for reconstructing unobserved haplotypes and/or inferring the values of missing genotypes. This can be deployed independently or, alternatively, it can be fully integrated into arbitrary genotype‐ or haplotype‐based association models such that the missing data and the association model are “estimated” simultaneously. The impact of such simultaneous analysis on inferences drawn from the association model is shown to be potentially significant. Our modelling components are packaged as an “add‐on” interface to the widely used WinBUGS software, which allows Markov chain Monte Carlo analysis of a wide range of statistical models. We illustrate their use with a series of increasingly complex analyses conducted on simulated data based on a real pharmacogenetic example. Genet. Epidemiol. 30:231–247, 2006. © 2006 Wiley‐Liss, Inc.
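The abstract's point about estimating missing genotypes and the association model simultaneously can be illustrated with a minimal Python sketch. This is not the authors' WinBUGS add-on or their haplotype-reconstruction method. It assumes a single biallelic SNP, a Hardy-Weinberg prior on genotypes, an additive effect on a normally distributed phenotype, and known model parameters. All function and parameter names are hypothetical. It shows only that the conditional distribution of a missing genotype depends on the individual's phenotype, which is why the two parts inform each other when fitted jointly.

```python
import math


def genotype_prior(allele_freq):
    """Hardy-Weinberg prior over genotypes coded 0, 1, 2 (copies of the minor allele).

    Assumption: Hardy-Weinberg equilibrium; the paper's own prior may differ.
    """
    p = allele_freq
    return [(1 - p) ** 2, 2 * p * (1 - p), p ** 2]


def normal_pdf(y, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))


def missing_genotype_posterior(y, allele_freq, intercept, effect, sd):
    """Posterior over a missing genotype given the phenotype y.

    Bayes' theorem: P(g | y) is proportional to P(g) * P(y | g), with an
    assumed additive association model y ~ Normal(intercept + effect * g, sd).
    """
    prior = genotype_prior(allele_freq)
    weights = [
        prior[g] * normal_pdf(y, intercept + effect * g, sd) for g in (0, 1, 2)
    ]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical values chosen purely for illustration.
posterior = missing_genotype_posterior(
    y=2.0, allele_freq=0.3, intercept=0.0, effect=1.0, sd=1.0
)
print(posterior)
```

In a full Markov chain Monte Carlo analysis, a conditional like this would be sampled in turn with the association parameters rather than evaluated at fixed values. When `effect` is zero the phenotype carries no information and the posterior reduces to the prior.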