Nonsynonymous SNPs: validation characteristics, derived allele frequency patterns, and suggestive evidence for natural selection
Author(s) -
Fredman David,
Sawyer Sarah L.,
Strömqvist Linda,
Mottagui-Tabar Salim,
Kidd Kenneth K.,
Wahlestedt Claes,
Chanock Stephen J.,
Brookes Anthony J.
Publication year - 2006
Publication title -
Human Mutation
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.981
H-Index - 162
eISSN - 1098-1004
pISSN - 1059-7794
DOI - 10.1002/humu.20289
Subject(s) - dbsnp , biology , single nucleotide polymorphism , nonsynonymous substitution , allele frequency , genetics , natural selection , selection (genetic algorithm) , snp , population , minor allele frequency , allele , evolutionary biology , genotype , gene , demography , genome , sociology , computer science , artificial intelligence
We experimentally investigated more than 1,200 entries in dbSNP that would change amino acids (nsSNPs), using various subsets of DNA samples drawn from 18 global populations (∼1,000 subjects in total). First, we mined the data for any SNP features that correlated with a high validation rate. Useful predictors of valid SNPs included multiple submissions to dbSNP, having a dbSNP validation statement, and being present in a low number of ESTs. Together, these features improved validation rates by almost 10‐fold. Higher‐abundance SNPs (e.g., T/C variants) also validated more frequently. Second, we considered derived alleles and noted a considerably (∼10%) increased average derived allele frequency (DAF) in Europeans vs. Africans, plus a further increase in some other populations. This was not primarily due to an SNP ascertainment bias, nor to the effects of natural selection. Instead, it can be explained as a drift‐based, progressive increase in DAF that occurs over many generations and becomes exaggerated during population bottlenecks. This observation could be used as the basis for novel DAF‐based tests for comparing demographic histories. Finally, we considered individual marker patterns and identified 37 SNPs with allele frequency variance or F_ST values consistent with the effects of population‐specific natural selection. Four particularly striking clusters of these markers were apparent, and three of these coincide with genes/regions from among only several dozen such domains previously suggested by others to carry signatures of selection. Hum Mutat 27(2), 173–186, 2006. © 2006 Wiley‐Liss, Inc.
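The two quantities the abstract relies on, derived allele frequency (DAF) and F_ST, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not say which F_ST estimator was used, so the code assumes the textbook heterozygosity-based definition, F_ST = (H_T - H_S) / H_T, for a single biallelic SNP. The function names and all allele counts below are hypothetical and chosen only for illustration.

```python
def derived_allele_frequency(derived_count, total_chromosomes):
    """DAF = copies of the derived allele / chromosomes genotyped.

    Assumes the ancestral state is already known (for example from an
    outgroup species), which this sketch does not attempt to infer.
    """
    if total_chromosomes <= 0:
        raise ValueError("total_chromosomes must be positive")
    if not 0 <= derived_count <= total_chromosomes:
        raise ValueError("derived_count must lie in [0, total_chromosomes]")
    return derived_count / total_chromosomes


def fst_biallelic(freqs, sizes=None):
    """Heterozygosity-based F_ST for one biallelic SNP (an assumed estimator).

    freqs: allele frequency of the same allele in each population.
    sizes: optional sample sizes used as weights (equal weights if omitted).
    """
    if not freqs:
        raise ValueError("at least one population is required")
    if sizes is None:
        sizes = [1] * len(freqs)
    total = sum(sizes)
    weights = [s / total for s in sizes]

    # Expected heterozygosity of the pooled population (H_T).
    p_bar = sum(w * p for w, p in zip(weights, freqs))
    h_t = 2 * p_bar * (1 - p_bar)

    # Weighted mean of within-population expected heterozygosity (H_S).
    h_s = sum(w * 2 * p * (1 - p) for w, p in zip(weights, freqs))

    # A monomorphic site carries no differentiation signal.
    if h_t == 0:
        return 0.0
    return (h_t - h_s) / h_t


# Hypothetical counts: derived-allele copies and chromosomes per population.
counts = {"pop_A": (30, 200), "pop_B": (90, 200)}
dafs = {name: derived_allele_frequency(d, n) for name, (d, n) in counts.items()}
fst = fst_biallelic(list(dafs.values()), sizes=[n for _, n in counts.values()])
```

Averaging `derived_allele_frequency` over many SNPs within each population gives the kind of per-population mean DAF that the abstract compares, while per-SNP `fst_biallelic` values could be ranked to flag unusually differentiated markers. Sample-size-corrected estimators would be preferable for real data with small or unequal samples.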