KFC2: A knowledge‐based hot spot prediction method based on interface solvation, atomic density, and plasticity features


Author(s) - Zhu Xiaolei, Mitchell Julie C.

Publication year - 2011

Publication title - Proteins: Structure, Function, and Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.699

H-Index - 191

eISSN - 1097-0134

pISSN - 0887-3585

DOI - 10.1002/prot.23094

Subject(s) - hot spot (computer programming), support vector machine, false positive rate, solvation, computer science, test set, feature (linguistics), set (abstract data type), training set, pattern recognition (psychology), artificial intelligence, false discovery rate, biological system, data mining, chemistry, solvent, biology, linguistics, philosophy, biochemistry, organic chemistry, gene, programming language, operating system

Hot spots constitute a small fraction of protein–protein interface residues, yet they account for a large fraction of the binding affinity. Building on our previous method (KFC), we present two new methods (KFC2a and KFC2b) that outperform other methods at hot spot prediction. Several improvements were made in developing these methods. First, we created a training data set that contained a similar number of hot spot and non‐hot spot residues. Second, we generated 47 different features and trained models on feature subsets of different sizes to avoid over‐fitting. Finally, two feature combinations were selected: one (used in KFC2a) is composed of eight features that are mainly related to solvent accessible surface area and local plasticity; the other (used in KFC2b) is composed of seven features, only two of which are identical to those used in KFC2a. Both models were built using support vector machines (SVMs). The two KFC2 models were then tested on a mixed independent test set and compared with other methods such as Robetta, FOLDEF, HotPoint, MINERVA, and KFC. KFC2a showed the highest predictive accuracy for hot spot residues (true positive rate, TPR = 0.85); however, its false positive rate was somewhat higher than that of the other models. KFC2b showed the best predictive accuracy for hot spot residues (TPR = 0.62) among all methods other than KFC2a, and its false positive rate (FPR = 0.15) was comparable with those of other highly predictive methods. Proteins 2011. © 2011 Wiley‐Liss, Inc.
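The abstract reports model quality as a true positive rate (TPR) and a false positive rate (FPR). As a minimal sketch of how those two figures follow from per-residue predictions, the Python snippet below computes them from binary labels. The function name `tpr_fpr` and the toy label lists are hypothetical and purely illustrative; they are not the paper's data, features, or code.

```python
from typing import Sequence, Tuple


def tpr_fpr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (TPR, FPR) for binary labels: 1 = hot spot, 0 = non-hot spot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # hot spots predicted as hot spots
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # hot spots that were missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-hot spots flagged as hot spots
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-hot spots correctly rejected

    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("both classes must be present in y_true")

    # TPR = TP / (TP + FN); FPR = FP / (FP + TN)
    return tp / (tp + fn), fp / (fp + tn)


# Toy example (made-up labels): 4 hot spots and 6 non-hot spots.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tpr, fpr = tpr_fpr(y_true, y_pred)
print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Read this way, a model with a higher TPR recovers more of the true hot spots, while a higher FPR means more non-hot spot residues are flagged, which is the trade-off the abstract describes between KFC2a and KFC2b.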
