
Prediction of hot spots in protein interfaces using extreme learning machines with the information of spatial neighbour residues
Author(s) - Wang Lin, Zhang Wenjuan, Gao Qiang, Xiong Congcong
Publication year - 2014
Publication title - IET Systems Biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.367
H-Index - 50
eISSN - 1751-8857
pISSN - 1751-8849
DOI - 10.1049/iet-syb.2013.0049
Subject(s) - alanine scanning, random forest, residue (chemistry), computer science, k nearest neighbors algorithm, artificial intelligence, pattern recognition (psychology), test set, biological system, machine learning, data mining, chemistry, mutagenesis, biology, biochemistry, mutation, gene
The identification of hot spots, a small subset of protein interface residues that accounts for the majority of the binding free energy, is becoming increasingly important for research on protein–protein interactions and drug design. For each interface residue to be predicted (the target residue), the authors extract hybrid features that incorporate a wide range of information on the target residue and its spatial neighbour residues, that is, the nearest contact residue in the other face (mirror-contact residue) and the nearest contact residue in the same face (intra-contact residue). Feature selection is performed using random forests to avoid over-fitting. The extreme learning machine is then employed to integrate these hybrid features for predicting hot spots in protein interfaces. In 5-fold cross-validation on the training set, their method achieves an accuracy (ACC) of 82.1% and a Matthews correlation coefficient (MCC) of 0.459, and outperforms some alternative machine learning methods in the comparison study. Furthermore, their method achieves an ACC of 76.8% and an MCC of 0.401 on the independent test set, and is more effective than the major existing hot spot predictors. Their prediction method offers a powerful tool for uncovering candidate residues in alanine scanning mutagenesis studies of functional protein interaction sites.
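
The two metrics reported in the abstract, ACC and MCC, can be computed from a binary confusion matrix. The following is a minimal sketch, not the authors' evaluation code: the function names and the toy labels are hypothetical, and the convention that 1 denotes a hot spot and 0 a non-hot spot is an assumption made for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (assumed: 1 = hot spot, 0 = non-hot spot)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """ACC = (TP + TN) / (TP + TN + FP + FN)."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels invented for illustration only (not data from the paper):
# 3 hot spots among 10 interface residues.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
all_negative = [0] * 10  # a trivial predictor that never calls a hot spot

counts = confusion_counts(y_true, y_pred)
trivial_counts = confusion_counts(y_true, all_negative)

print("ACC:", accuracy(*counts), "MCC:", round(mcc(*counts), 3))
print("trivial ACC:", accuracy(*trivial_counts), "trivial MCC:", mcc(*trivial_counts))
```

Because hot spots are a minority of interface residues, a predictor that labels every residue as a non-hot spot can still reach a high ACC while its MCC stays at zero, which is one reason to read the two reported metrics together.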