Practical considerations for imputation of untyped markers in admixed populations
Author(s) - Shriner Daniel, Adeyemo Adebowale, Chen Guanjie, Rotimi Charles N.
Publication year - 2010
Publication title - Genetic Epidemiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.301
H-Index - 98
eISSN - 1098-2272
pISSN - 0741-0395
DOI - 10.1002/gepi.20457
Subject(s) - international hapmap project , imputation (statistics) , haplotype estimation , statistics , 1000 genomes project , genetic association , biology , genetics , heteroscedasticity , population , haplotype , single nucleotide polymorphism , missing data , genotype , mathematics , demography , gene , sociology
Imputation of genotypes for markers untyped in a study sample has become a standard approach to increase genome coverage in genome‐wide association studies at practically zero cost. Most methods for imputing missing genotypes extend previously described algorithms for inferring haplotype phase. These algorithms generally fall into three classes based on the underlying model for estimating the conditional distribution of haplotype frequencies: a cluster‐based model, a multinomial model, or a population genetics‐based model. We compared BEAGLE, PLINK, and MACH, representing the three classes of models, respectively, with specific attention to measures of imputation success and selection of the reference panel for an admixed study sample of African Americans. Based on analysis of chromosome 22 and after calibration to a fixed level of 90% concordance between experimentally determined and imputed genotypes, MACH yielded the largest absolute number of successfully imputed markers and the largest gain in coverage of the variation captured by HapMap reference panels. Following the common practice of performing imputation once, the Yoruba in Ibadan, Nigeria (YRI) reference panel outperformed other HapMap reference panels, including (1) African ancestry from Southwest USA (ASW) data, (2) an unweighted combination of the Northern and Western Europe (CEU) and YRI data into a single reference panel, and (3) a combination of the CEU and YRI data into a single reference panel with weights matching estimates of admixture proportions. For our admixed study sample, the optimal strategy involved imputing twice with the HapMap CEU and YRI reference panels separately and then merging the data sets. Genet. Epidemiol. 34: 258–265, 2010. © 2009 Wiley‐Liss, Inc.
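
The abstract evaluates imputation by the concordance between experimentally determined and imputed genotypes. As a rough illustration only, concordance can be taken as the fraction of comparable genotype calls on which the two sources agree. The sketch below assumes genotypes coded as minor-allele counts (0, 1, 2) with `None` for missing calls. The function name `concordance`, the coding, and the toy data are hypothetical and are not taken from the article, whose exact definition may differ.

```python
from typing import Optional, Sequence


def concordance(
    typed: Sequence[Optional[int]],
    imputed: Sequence[Optional[int]],
) -> float:
    """Fraction of non-missing genotype pairs on which the two calls agree.

    Assumption: genotypes are coded as allele counts (0, 1, 2) and a
    missing call is represented by None. Pairs with a missing call on
    either side are excluded from the denominator.
    """
    if len(typed) != len(imputed):
        raise ValueError("genotype vectors must have the same length")

    # Keep only positions where both a typed and an imputed call exist.
    pairs = [(t, i) for t, i in zip(typed, imputed) if t is not None and i is not None]
    if not pairs:
        raise ValueError("no comparable genotypes")

    matches = sum(1 for t, i in pairs if t == i)
    return matches / len(pairs)


# Toy data for one marker across ten individuals (made up for illustration).
typed = [0, 1, 2, 1, None, 0, 2, 1, 0, 1]
imputed = [0, 1, 2, 2, 1, 0, 2, 1, 0, 1]
rate = concordance(typed, imputed)
```

In this toy example one individual has no experimental call and is skipped, so the rate is computed over the remaining nine pairs. A study-level figure such as the 90% level mentioned in the abstract would aggregate over many markers and individuals.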