Efficient comprehensive scoring of docked protein complexes using probabilistic support vector machines
Author(s) - Martin Oliver, Schomburg Dietmar
Publication year - 2008
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.21603
Subject(s) - docking (animal) , probabilistic logic , macromolecular docking , computer science , support vector machine , protein structure prediction , protein structure , artificial intelligence , machine learning , biological system , computational biology , chemistry , biology , biochemistry , medicine , nursing
Biological systems and processes rely on a complex network of molecular interactions. The association of biological macromolecules is a fundamental biochemical phenomenon, crucial for understanding complex living systems, and protein‐protein docking methods aim to predict protein complexes computationally from their individual subunits. Docking algorithms generally produce large numbers of putative protein complexes, of which only a few resemble the native complex structure within an acceptable degree of structural similarity. A major challenge in docking is to extract the near‐native structure(s) from this large pool of solutions, the so‐called scoring or ranking problem. In this work, a series of structural, chemical, biological and physical properties is used to classify docked protein‐protein complexes. These properties include specialized energy functions, evolutionary relationship, class‐specific residue interface propensities, gap volume, buried surface area, empirical pair potentials at the residue and atom level, and measures of the tightness of fit. Efficient, comprehensive scoring functions were developed by combining probabilistic Support Vector Machines with this array of properties, using the largest protein‐protein docking benchmark available at the time. The resulting classifiers are shown to be specific for certain types of protein‐protein complexes and detect near‐native complex conformations in large sets of decoys with high sensitivity. Ranking by classification probability drastically improved the ranking of near‐native structures, leading to a significant enrichment of near‐native complex conformations within the top ranks. The developed schemes were shown to outperform five previously published scoring functions. Proteins 2008. © 2007 Wiley‐Liss, Inc.
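The ranking step described in the abstract, ordering decoys by their classification probability, can be illustrated with a minimal sketch. The abstract does not say how the SVM outputs are converted to probabilities; the sketch below assumes Platt scaling, a common choice for probabilistic SVMs, and fits it by plain gradient descent. The decision values, labels, decoy identifiers and function names are all hypothetical and are not taken from the paper.

```python
import math

def fit_platt(scores, labels, lr=0.1, steps=2000):
    """Fit P(near-native | f) = 1 / (1 + exp(A*f + B)) by gradient descent on log-loss.

    scores: SVM decision values f (assumed input, hypothetical here)
    labels: 1 for near-native decoys, 0 otherwise
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for f, y in zip(scores, labels):
            p = 1.0 / (1.0 + math.exp(a * f + b))
            # Gradient of the log-loss with respect to A and B
            grad_a += (y - p) * f
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def platt_probability(f, a, b):
    """Map a decision value to a probability with the fitted sigmoid."""
    return 1.0 / (1.0 + math.exp(a * f + b))

def rank_decoys(decoy_scores, a, b):
    """Return (decoy_id, probability) pairs sorted by decreasing probability."""
    ranked = [(d, platt_probability(f, a, b)) for d, f in decoy_scores.items()]
    return sorted(ranked, key=lambda item: item[1], reverse=True)

def hits_in_top(ranked, near_native_ids, k):
    """Count near-native decoys among the top k ranks (a simple enrichment measure)."""
    return sum(1 for d, _ in ranked[:k] if d in near_native_ids)

# Hypothetical training data: decision values with known near-native labels.
train_scores = [-2.1, -1.3, -0.8, -0.2, 0.4, 0.9, 1.5, 2.2]
train_labels = [0, 0, 0, 1, 0, 1, 1, 1]
a, b = fit_platt(train_scores, train_labels)

# Hypothetical decoys of a new complex, keyed by made-up identifiers.
decoys = {"d1": -1.7, "d2": 1.8, "d3": 0.3, "d4": -0.4}
ranked = rank_decoys(decoys, a, b)
top2_hits = hits_in_top(ranked, {"d2", "d3"}, k=2)
```

Counting near-native decoys within the top k ranks is one simple way to express the enrichment the abstract refers to; the paper's own evaluation protocol may differ.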