Single-Nucleotide Polymorphism Bioinformatics
Author(s) - Andrew D. Johnson
Publication year - 2009
Publication title - Circulation: Cardiovascular Genetics
Language(s) - English
Resource type - Journals
eISSN - 1942-325X
pISSN - 1942-3268
DOI - 10.1161/circgenetics.109.872010
Subject(s) - single nucleotide polymorphism , dbsnp , snp genotyping , biology , genotyping , genetics , tag snp , framingham heart study , international hapmap project , genetic variation , genetic genealogy , bioinformatics , genotype , disease , gene , medicine , framingham risk score , population , environmental health , pathology
Recent years have seen near-exponential growth in knowledge regarding genetic and genomic variation as more genomes have been sequenced, and corresponding advances and economies of scale in sequencing and genotyping technologies have reduced their relative costs. In parallel with these developments, discoveries of genes contributing to monogenic and complex diseases have rapidly advanced, and bioinformatics databases and software relating to the collection and analysis of genetic data have increased in number, size, and scope. Single-nucleotide polymorphisms (SNPs), comprising the most abundant type of genetic variation, are now the principal raw material underlying most genetic studies and databases. Although other types of variation, including indels, microsatellites, copy number variants, and epigenetic markers, remain important to consider and can impact disease, SNPs are generally the easiest to ascertain and the most useful and widely applied markers in modern genetic studies. Researchers and clinician-researchers are confronted with a dizzying array of software choices and increasingly large and complex datasets and databases relating to SNPs, sometimes working without a geneticist or a bioinformatician to guide them. The principal aim of this review is to provide a comprehensive overview of available bioinformatics resources relating to human genetics research, with an emphasis on SNP-centered resources. The review also provides a resource for students seeking an introduction to SNP genetics resources and for wet laboratory molecular biologists conducting SNP-centered research who want to expand their knowledge of ways to apply SNP tools and databases. A number of important issues that affect users and developers of SNP bioinformatics resources are discussed throughout, along with practical examples.
Although many of the resources described have relevance to and origins in the study of nonhuman species, this review focuses on human clinical applications. The review discusses basic SNP bioinformatics issues, critical databases and their uses, basic strategies and queries using APOE examples, software and tools relating to association studies, the prediction and validation of functional SNPs, and miscellaneous SNP resources. The focus is primarily on academic resources that are widely available. Supplemental Table IV provides URL links for all resources described in the text sections in order of their appearance. Readers are encouraged to download the Data Supplement, which contains nearly half of the full review article text, including sections on practical examples relating to APOE and functional SNP prediction. Key abbreviations and definitions often encountered in this article and other SNP-related articles, databases, and informatics tools are given in the Table.