Imputation methods to improve inference in SNP association studies
Author(s) - Dai James Y., Ruczinski Ingo, LeBlanc Michael, Kooperberg Charles
Publication year - 2006
Publication title - Genetic Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.301
H-Index - 98
eISSN - 1098-2272
pISSN - 0741-0395
DOI - 10.1002/gepi.20180
Subject(s) - imputation (statistics), missing data, inference, computer science, genetic association, statistics, single nucleotide polymorphism, data mining, mathematics, artificial intelligence, machine learning, biology, genetics, genotype, gene
Abstract Missing single nucleotide polymorphisms (SNPs) are quite common in genetic association studies. Subjects with missing SNPs are often discarded in analyses, which may seriously undermine inference about SNP‐disease association. In this article, we develop two haplotype‐based imputation approaches and one tree‐based imputation approach for association studies. The emphasis is on evaluating the impact of imputation on parameter estimation, compared to the standard practice of ignoring missing data. The haplotype‐based approaches build on haplotype reconstruction by the expectation‐maximization (EM) algorithm or a weighted EM (WEM) algorithm, depending on whether case‐control status is taken into account. The tree‐based approach uses a Gibbs sampler to iteratively sample from a full conditional distribution, which is obtained from the classification and regression tree (CART) algorithm. We employ a standard multiple imputation procedure to account for the uncertainty of imputation. We apply the methods to simulated data as well as a case‐control study on developmental dyslexia. Our results suggest that imputation generally improves efficiency over the standard practice of ignoring missing data. The tree‐based approach performs comparably to the haplotype‐based approaches but has a computational advantage. The WEM approach yields the smallest bias at the price of increased variance. Genet. Epidemiol. © 2006 Wiley‐Liss, Inc.
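The abstract mentions a standard multiple imputation procedure without giving details. As a minimal sketch, assuming the usual combining rules (Rubin's rules) are what is meant, the estimates from m imputed data sets could be pooled as below. The function name pool_rubin and its inputs are hypothetical and are not taken from the article.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation results with Rubin's rules (illustrative sketch).

    estimates: one point estimate (e.g. a log odds ratio) per imputed data set
    variances: the matching squared standard errors
    Returns the pooled estimate and its pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations and matching lengths")

    q_bar = mean(estimates)   # pooled point estimate
    u_bar = mean(variances)   # within-imputation variance
    b = variance(estimates)   # between-imputation variance (m - 1 denominator)

    # Total variance: within part plus inflated between part
    t = u_bar + (1 + 1 / m) * b
    return q_bar, math.sqrt(t)
```

The between-imputation term is what carries the uncertainty of the imputation itself: if every imputed data set gave the same estimate, the pooled standard error would reduce to the ordinary within-imputation one.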