Identification of Sequence Variants within Experimentally Validated Protein Interaction Sites Provides New Insights into Molecular Mechanisms of Disease Development
Author(s) - Skrlj Blaz, Konc Janez, Kunej Tanja
Publication year - 2017
Publication title - Molecular Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.481
H-Index - 68
eISSN - 1868-1751
pISSN - 1868-1743
DOI - 10.1002/minf.201700017
Subject(s) - uniprot, proteogenomics, computational biology, sequence (biology), biology, sequence database, protein sequencing, identification (biology), in silico, genetics, bioinformatics, peptide sequence, genome, genomics, gene, botany
Protein interactions (PI) underlie complex biological processes. Protein interaction partners include DNA, RNA, ions, small chemical compounds, and other proteins (protein‐protein interactions; PPI). Analysing sequence variants located within regions that correspond to experimentally validated PI sites offers new opportunities for understanding complex diseases. Such information has not been collected systematically because the relevant datasets are dispersed across databases and publications. Sequence variants and PI regions were obtained from the UniProt database. The location of each variant was compared with the start and end positions of each PPI region. Associations of sequence variants with phenotypes were obtained from databases including COSMIC, GAD, PharmGKB, and dbSNP. We developed a catalogue of 603 sequence variants located within regions corresponding to experimentally validated PI sites, most of them PPI regions. These sequence variants were previously associated with risk for cancer and for reproductive, ageing-related, renal, and immune system diseases. The catalogue connects information from different research papers and databases, represents a new layer of information, and enables the design of new hypotheses. It provides a baseline for prioritizing sequence variants that may affect protein function and binding sites. The study contributes to the development of the proteogenomics field and provides new insights into the molecular mechanisms underlying disease development.
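The position comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline, which this record does not describe. The `Region` and `Variant` record types, their field names, the helper `variants_in_regions`, and the toy identifiers are all hypothetical, and the sketch assumes 1-based, inclusive residue coordinates for both variants and interaction regions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical record for one annotated interaction region."""
    protein: str   # protein identifier (toy value, not a real accession)
    start: int     # first residue of the region, 1-based inclusive (assumed)
    end: int       # last residue of the region, 1-based inclusive (assumed)
    partner: str   # interaction partner type, e.g. "protein" or "DNA"


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one sequence variant."""
    protein: str     # protein identifier the variant maps to
    position: int    # affected residue position, 1-based (assumed)
    variant_id: str  # toy identifier, not a real database ID


def variants_in_regions(variants, regions):
    """Return (variant, region) pairs where the variant lies inside the region."""
    # Group regions by protein so each variant is only checked
    # against regions annotated on the same protein.
    by_protein = {}
    for region in regions:
        by_protein.setdefault(region.protein, []).append(region)

    hits = []
    for variant in variants:
        for region in by_protein.get(variant.protein, []):
            # Closed-interval test against the region's start and end.
            if region.start <= variant.position <= region.end:
                hits.append((variant, region))
    return hits


# Toy data for illustration only.
regions = [
    Region("PROT_A", 10, 25, "protein"),
    Region("PROT_A", 40, 60, "DNA"),
    Region("PROT_B", 5, 15, "protein"),
]
variants = [
    Variant("PROT_A", 12, "var_a"),  # inside the 10-25 region
    Variant("PROT_A", 30, "var_b"),  # between regions, no hit
    Variant("PROT_B", 15, "var_c"),  # on the region boundary, counted as inside
    Variant("PROT_C", 7, "var_d"),   # protein without annotated regions
]

hits = variants_in_regions(variants, regions)
for variant, region in hits:
    print(variant.variant_id, region.protein, region.start, region.end, region.partner)
```

Treating the region boundaries as inclusive is a design choice of this sketch; a different coordinate convention in the input data would require adjusting the comparison.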