Clustering biomolecular complexes by residue contacts similarity
Author(s) -
Rodrigues João P. G. L. M.,
Trellet Mikaël,
Schmitz Christophe,
Kastritis Panagiotis,
Karaca Ezgi,
Melquiond Adrien S. J.,
Bonvin Alexandre M. J. J.
Publication year - 2012
Publication title -
Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.24078
Subject(s) - cluster analysis, similarity (geometry), computer science, data mining, structural similarity, biological system, computational biology, biology, artificial intelligence, image (mathematics)
Abstract - Inaccuracies in computational molecular modeling methods are often counterweighed by brute-force generation of a plethora of putative solutions. These are then typically sieved via structural clustering based on similarity measures such as the root mean square deviation (RMSD) of atomic positions. Albeit widely used, these measures suffer from several theoretical and technical limitations (e.g., choice of regions for fitting) that impair their application in multicomponent systems (N > 2), large-scale studies (e.g., interactomes), and other time-critical scenarios. We present here a simple similarity measure for structural clustering based on atomic contacts (the fraction of common contacts) and compare it with the most used similarity measure of the protein docking community (interface backbone RMSD). We show that this method produces very compact clusters in remarkably short time when applied to a collection of binary and multicomponent protein–protein and protein–DNA complexes. Furthermore, it allows easy clustering of similar conformations of multicomponent symmetrical assemblies in which chain permutations can occur. Simple contact-based metrics should be applicable to other structural biology clustering problems, in particular for time-critical or large-scale endeavors. Proteins 2012; © 2012 Wiley Periodicals, Inc.
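The abstract names the fraction of common contacts but does not give its formula, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that a contact is a pair of residues from different chains with at least one atom pair within a distance cutoff, and that the measure is the share of one model's contacts that also occur in the other. The function names, the 5.0 Å cutoff, and the toy coordinates are all hypothetical.

```python
from itertools import combinations
from math import dist


def residue_contacts(atoms, cutoff=5.0):
    """Return the set of inter-chain residue pairs with an atom pair within cutoff.

    `atoms` is a list of (chain, residue_number, (x, y, z)) tuples.
    The 5.0 cutoff is an assumed value, not one taken from the abstract.
    """
    contacts = set()
    for (c1, r1, xyz1), (c2, r2, xyz2) in combinations(atoms, 2):
        if c1 != c2 and dist(xyz1, xyz2) <= cutoff:
            # Sort the pair so that (A, B) and (B, A) count as one contact.
            contacts.add(tuple(sorted([(c1, r1), (c2, r2)])))
    return contacts


def fraction_common_contacts(contacts_a, contacts_b):
    """Share of the contacts of model A that are also present in model B."""
    if not contacts_a:
        return 0.0
    return len(contacts_a & contacts_b) / len(contacts_a)


# Two toy models of the same two-chain complex (made-up coordinates).
model_1 = [
    ("A", 10, (0.0, 0.0, 0.0)),
    ("A", 11, (3.0, 0.0, 0.0)),
    ("B", 5, (0.0, 3.5, 0.0)),
    ("B", 6, (20.0, 0.0, 0.0)),
]
model_2 = [
    ("A", 10, (0.0, 0.0, 0.0)),
    ("A", 11, (3.0, 0.0, 0.0)),
    ("B", 5, (-3.0, 3.5, 0.0)),
    ("B", 6, (20.0, 0.0, 0.0)),
]

contacts_1 = residue_contacts(model_1)
contacts_2 = residue_contacts(model_2)

print(fraction_common_contacts(contacts_1, contacts_2))
print(fraction_common_contacts(contacts_2, contacts_1))
```

Under this assumed definition the measure is not symmetric: the two calls can return different values when the models have different numbers of contacts. Because contacts are compared as sets, no structural superposition or choice of fitting region is needed, which is the limitation of RMSD that the abstract points to.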