Effective discrimination between biologically relevant contacts and crystal packing contacts using new determinants
Author(s) - Luo Jiesi, Guo Yanzhi, Fu Yuanyuan, Wang Yu, Li Wenling, Li Menglong
Publication year - 2014
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.24670
Subject(s) - crystal (programming language) , materials science , biological system , protein crystallization , core (optical fiber) , crystallography , chemistry , computer science , crystallization , biology , organic chemistry , composite material , programming language
In structural models determined by X-ray crystallography, contacts between molecules fall into two categories: biologically relevant contacts and crystal packing contacts. Despite the growth in the number and quality of available structures with large crystal packing contacts, distinguishing crystal packing contacts from biologically relevant contacts remains a difficult task, and errors can lead to misinterpretation of structural models. In this study, we performed a systematic analysis of biologically relevant contacts and crystal packing contacts. The results reveal that biologically relevant contacts are more tightly packed than crystal packing contacts, a property that may contribute to the formation of their interfacial core region. Moreover, the differences between the core and surface regions of biologically relevant contacts in amino acid composition and evolutionary measures are more pronounced than those of crystal packing contacts, and these differences appear useful for distinguishing the two categories. Based on the features derived from our analysis, we developed a random forest model to classify biologically relevant contacts and crystal packing contacts. Our method achieves an area under the receiver operating characteristic curve of 0.923 in 5-fold cross-validation and accuracies of 91.4% and 91.7% on two different test sets. In a comparison study, our model outperforms existing methods such as DiMoVo, Pita, Pisa, and Eppic. We believe this study will be useful for the validation of oligomeric proteins and protein complexes. The model and all data used in this paper are freely available at http://cic.scu.edu.cn/bioinformatics/bio-cry.zip.
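
The cross-validation figure of 0.923 quoted above is most naturally read as an area under the ROC curve (AUC). As a rough illustration of what such a number measures, the sketch below computes AUC from classifier scores using the pairwise (Mann-Whitney) formulation and averages it over held-out folds. It is not the authors' code: the function names `roc_auc` and `mean_cv_auc`, the label encoding (1 for biologically relevant, 0 for crystal packing), and the toy scores are all assumptions made for this example.

```python
from typing import Iterable, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: 1 = biologically relevant contact, 0 = crystal packing contact
            (encoding assumed for illustration).
    scores: classifier outputs, higher meaning "more likely biological".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    # Count positive/negative pairs that are ranked correctly;
    # a tie contributes half a correct pair.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mean_cv_auc(folds: Iterable[Tuple[Sequence[int], Sequence[float]]]) -> float:
    """Average AUC over cross-validation folds.

    folds: one (labels, scores) pair per held-out fold.
    """
    aucs = [roc_auc(y, s) for y, s in folds]
    return sum(aucs) / len(aucs)


# Toy data (hypothetical): three biological and three crystal contacts.
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 1.0 means every biologically relevant contact is scored above every crystal packing contact, and 0.5 corresponds to random ranking. In a 5-fold setting, `mean_cv_auc` would be called with five (labels, scores) pairs, one per held-out fold.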
