
The GWAS-MAP platform for aggregation of results of genome-wide association studies and the GWAS-MAP|homo database of 70 billion genetic associations of human traits
Author(s) -
Tatiana Shashkova,
Denis D Gorev,
Eugene D Pakhomov,
Alexandra S. Shadrina,
Sodbo Sharapov,
Yakov A. Tsepilov,
Lennart C. Karssen,
Yurii S. Aulchenko
Publication year - 2020
Publication title - Vavilovskij žurnal genetiki i selekcii
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.188
H-Index - 7
eISSN - 2500-0462
pISSN - 2500-3259
DOI - 10.18699/vj20.686
Subject(s) - genome-wide association study, summary statistics, genetic association, data science, computer science, computational biology, database, biology, data mining, statistics, genetics, single nucleotide polymorphism, mathematics, gene, genotype
Hundreds of genome-wide association studies (GWAS) of human traits are performed each year. The results of GWAS are often published in the form of summary statistics. Information from summary statistics can be used for multiple purposes, from fundamental research in biology and genetics to the search for potential biomarkers and therapeutic targets. While the amount of GWAS summary statistics collected by the scientific community is rapidly increasing, the use of these data is limited by the lack of generally accepted standards. In particular, researchers who would like to use GWAS summary statistics in their studies find that the data are scattered across multiple websites, are presented in a variety of formats, and often have not been quality controlled. Moreover, each available tool for analyzing summary statistics requires the data to be presented in its own internal format. To address these issues, we developed GWAS-MAP, a high-throughput platform for aggregating, storing, analyzing, visualizing and providing access to a database of big data resulting from region- and genome-wide association studies. The database currently contains information on more than 70 billion associations between genetic variants and human diseases, quantitative traits, and “omics” traits. The GWAS-MAP platform and database can be used for studying the etiology of human diseases, building predictive risk models and finding potential biomarkers and therapeutic interventions. To demonstrate a typical application of the platform as an approach for extracting new biological knowledge and establishing mechanistic hypotheses, we analyzed varicose veins, a disease affecting on average every third adult in Russia. The results of the analysis confirmed known epidemiologic associations for this disease and led us to hypothesize that increased levels of the MICB and CD209 proteins in human plasma may increase susceptibility to varicose veins.