Prediction of interface residues in protein–protein complexes by a consensus neural network method: Test against NMR data
Author(s) - Chen Huiling, Zhou HuanXiang
Publication year - 2005
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.20514
Subject(s) - artificial neural network, test set, computer science, protein structure prediction, benchmark (surveying), docking (animal), protein structure, protein data bank, artificial intelligence, biological system, data mining, machine learning, chemistry, biology, biochemistry, medicine, nursing, geodesy, geography
Abstract - The number of structures of protein–protein complexes deposited in the Protein Data Bank is growing rapidly. These structures embed important information for predicting structures of new protein complexes. This motivated us to develop the PPISP method for predicting interface residues in protein–protein complexes. In PPISP, sequence profiles and solvent accessibility of spatially neighboring surface residues were used as input to a neural network. The network was trained on native interface residues collected from the Protein Data Bank. The prediction accuracy at the time was 70% with 47% coverage of native interface residues. We have now extensively improved PPISP. The training set now consists of 1156 nonhomologous protein chains. Tests on a set of 100 nonhomologous protein chains showed that the prediction accuracy has increased to 80% with 51% coverage. To solve the problem of over‐prediction and under‐prediction associated with individual neural network models, we developed a consensus method, cons‐PPISP, that combines predictions from multiple models with different levels of accuracy and coverage. Applied to a benchmark set of 68 proteins for protein–protein docking, the consensus approach outperformed the best individual models by 3–8 percentage points in accuracy. To demonstrate the predictive power of cons‐PPISP, eight complex‐forming proteins with interfaces characterized by NMR were tested. These proteins are nonhomologous to the training set and have a total of 144 interface residues identified by chemical shift perturbation. cons‐PPISP predicted 174 interface residues with 69% accuracy and 47% coverage and promises to complement experimental techniques in characterizing protein–protein interfaces. Proteins 2005. © 2005 Wiley‐Liss, Inc.
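The abstract reports results as accuracy and coverage and describes combining several models into a consensus, but it does not spell out either calculation. The Python sketch below is illustrative only and is not the authors' implementation. It assumes accuracy means the fraction of predicted residues that are native interface residues, and coverage means the fraction of native interface residues that are predicted. The combination rule shown (a simple vote threshold) is a stand-in chosen for illustration; the function names, the `min_votes` parameter, and the residue numbers are all hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple


def accuracy_and_coverage(predicted: Set[int], native: Set[int]) -> Tuple[float, float]:
    """Return (accuracy, coverage) for one protein chain.

    Assumed definitions (not taken from the paper):
      accuracy = correct predictions / all predicted residues
      coverage = correct predictions / all native interface residues
    """
    correct = len(predicted & native)
    accuracy = correct / len(predicted) if predicted else 0.0
    coverage = correct / len(native) if native else 0.0
    return accuracy, coverage


def consensus_vote(model_predictions: Iterable[Set[int]], min_votes: int) -> Set[int]:
    """Keep residues predicted by at least `min_votes` models.

    A plain vote threshold is an assumption made for this sketch; the
    abstract only says that predictions from multiple models are combined.
    """
    votes: Dict[int, int] = {}
    for prediction in model_predictions:
        for residue in prediction:
            votes[residue] = votes.get(residue, 0) + 1
    return {residue for residue, count in votes.items() if count >= min_votes}


# Toy data: residue numbers are invented for illustration.
native = {10, 11, 12, 13, 14, 15}
models = [
    {10, 11, 12, 30},              # fewer predictions, mostly correct
    {10, 11, 12, 13, 30, 31},      # more predictions, more false positives
    {11, 12, 13, 14, 40, 41, 42},  # broad predictions, several false positives
]

consensus = consensus_vote(models, min_votes=2)
accuracy, coverage = accuracy_and_coverage(consensus, native)
print(sorted(consensus), round(accuracy, 2), round(coverage, 2))
```

In this toy case, raising `min_votes` trades coverage for accuracy, which is the kind of over-prediction versus under-prediction balance the abstract attributes to individual models.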