Human leucocyte antigen class I and II imputation in a multiracial population
Author(s) -
Kuniholm M. H.,
Xie X.,
Anastos K.,
Xue X.,
Reimers L.,
French A. L.,
Gange S. J.,
Kassaye S. G.,
Kovacs A.,
Wang T.,
Aouizerat B. E.,
Strickler H. D.
Publication year - 2016
Publication title -
International Journal of Immunogenetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.41
H-Index - 47
eISSN - 1744-313X
pISSN - 1744-3121
DOI - 10.1111/iji.12292
Subject(s) - imputation (statistics) , human leukocyte antigen , genotyping , snp , genome wide association study , 1000 genomes project , allele , population , genetics , genetic association , biology , genotype , single nucleotide polymorphism , gene , antigen , medicine , statistics , missing data , mathematics , environmental health
Summary Human leucocyte antigen (HLA) genes play a central role in the response to pathogens and in autoimmunity. Research to understand the effects of HLA genes on health has been limited because HLA genotyping protocols are labour intensive and expensive. Recently, algorithms to impute HLA genotype data from genome-wide association study (GWAS) data have been published. However, imputation accuracy for most of these algorithms was based primarily on training data sets of European ancestry individuals. We evaluated the performance of two HLA-dedicated imputation algorithms, SNP2HLA and HIBAG, in a multiracial population of n = 1587 women with HLA genotyping data obtained by gold standard methods. We first compared the accuracy (defined as the percentage of correctly predicted alleles) of HLA-B and HLA-C imputation by SNP2HLA and HIBAG, splitting the data set into an 80% training group and a 20% testing group. Estimates of accuracy for HIBAG were the same as or better than those for SNP2HLA. We then conducted a more thorough test of HIBAG imputation accuracy using five independent 10-fold cross-validation procedures, with ancestry groups delineated using ancestry informative markers. Overall accuracy for HIBAG was 89%. Accuracy by HLA gene was 93% for HLA-A, 84% for HLA-B, 94% for HLA-C, 83% for HLA-DQA1, 91% for HLA-DQB1 and 88% for HLA-DRB1. Accuracy was highest in the African ancestry group (the largest group) and lowest in the Hispanic group (the smallest group). Despite suboptimal imputation accuracy for some HLA gene/ancestry group combinations, the HIBAG algorithm has the advantage of providing posterior estimates of accuracy, which enable the investigator to analyse subsets of the population with high predicted (e.g. >95%) imputation accuracy.
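The accuracy measure described in the summary (the percentage of correctly predicted alleles) and the idea of restricting analysis to high-confidence predictions can be illustrated with a minimal Python sketch. This is not the authors' code and does not call SNP2HLA or HIBAG. The function names, the toy genotypes, the per-sample posterior values and the 0.95 cut-off are all hypothetical. The sketch assumes each sample has one unphased pair of alleles per gene and that each imputed genotype comes with a single posterior probability.

```python
from collections import Counter


def allele_matches(true_pair, predicted_pair):
    """Count predicted alleles (0, 1 or 2) that match the true genotype, ignoring allele order."""
    overlap = Counter(true_pair) & Counter(predicted_pair)  # multiset intersection
    return sum(overlap.values())


def allele_accuracy(true_genotypes, predicted_genotypes):
    """Fraction of correctly predicted alleles across all samples (two alleles per sample)."""
    if not true_genotypes or len(true_genotypes) != len(predicted_genotypes):
        raise ValueError("need two non-empty genotype lists of equal length")
    correct = sum(allele_matches(t, p) for t, p in zip(true_genotypes, predicted_genotypes))
    return correct / (2 * len(true_genotypes))


def high_confidence_indices(posteriors, threshold):
    """Indices of samples whose posterior probability exceeds the (assumed) threshold."""
    return [i for i, p in enumerate(posteriors) if p > threshold]


# Hypothetical toy data for a single gene; not taken from the study.
true_calls = [("07:02", "08:01"), ("15:01", "15:01"), ("35:01", "44:02")]
imputed_calls = [("08:01", "07:02"), ("15:01", "40:01"), ("35:01", "44:03")]
posteriors = [0.98, 0.60, 0.97]

overall = allele_accuracy(true_calls, imputed_calls)

kept = high_confidence_indices(posteriors, threshold=0.95)
subset = allele_accuracy([true_calls[i] for i in kept], [imputed_calls[i] for i in kept])

print(f"overall allele accuracy: {overall:.3f}")
print(f"accuracy in high-confidence subset ({len(kept)} samples): {subset:.3f}")
```

In this toy example, filtering on the posterior drops the least certain sample, so the retained subset is smaller but more accurately imputed. That is the trade-off the summary describes for analysing subsets with high predicted accuracy.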