Clustering algorithms for identifying core atom sets and for assessing the precision of protein structure ensembles
Author(s) - Snyder David A., Montelione Gaetano T.
Publication year - 2005
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.20402
Subject(s) - algorithm , superimposition , atom (system on chip) , cluster analysis , measure (data warehouse) , set (abstract data type) , force field (fiction) , kurtosis , partition (number theory) , computer science , mathematics , data mining , artificial intelligence , combinatorics , statistics , embedded system , programming language
An important open question in the field of NMR‐based biomolecular structure determination is how best to characterize the precision of the resulting ensemble of structures. Typically, the RMSD, as minimized in superimposing the ensemble of structures, is the preferred measure of precision. However, the presence of poorly determined atomic coordinates and of multiple “RMSD‐stable domains” (locally well‐defined regions that are not aligned in global superimpositions) complicates RMSD calculations. In this paper, we present a method, based on a novel, structurally defined order parameter, for identifying a set of core atoms to use in determining superimpositions for RMSD calculations. In addition, we present a method for deciding whether to partition that core atom set into “RMSD‐stable domains” and, if so, how to partition it. We demonstrate our algorithm and its application in calculating statistically sound RMSD values by applying it to a set of NMR‐derived structural ensembles, superimposing each RMSD‐stable domain (or the entire core atom set, where appropriate) found in each protein structure under consideration. A parameter calculated by our algorithm using a novel, kurtosis‐based criterion, the ϵ‐value, is a measure of precision of the superimposition that complements the RMSD. In addition, we compare our algorithm with previously described algorithms for determining core atom sets. The methods presented in this paper for biomolecular structure superimposition are quite general, and have application in many areas of structural bioinformatics and structural biology. Proteins 2005. © 2005 Wiley‐Liss, Inc.
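To illustrate why the choice of core atoms matters for RMSD calculations, the following is a minimal sketch of a least-squares superimposition restricted to a chosen atom subset. It uses the standard Kabsch (SVD-based) fit, not the order parameter, partitioning scheme, or ϵ‐value described in the abstract, whose details are not given here. The function names `kabsch_superimpose` and `rmsd`, the synthetic coordinates, and the hand-picked `core` index set are all assumptions made for illustration.

```python
import numpy as np


def kabsch_superimpose(mobile, reference, core):
    """Fit `mobile` onto `reference` using only the atoms indexed by `core`.

    Both coordinate arrays have shape (n_atoms, 3). The rotation and
    translation are derived from the core atoms, then applied to all atoms.
    """
    p = mobile[core]
    q = reference[core]
    p_cen = p.mean(axis=0)
    q_cen = q.mean(axis=0)
    # 3x3 covariance matrix between the centered core coordinates.
    h = (p - p_cen).T @ (q - q_cen)
    u, _, vt = np.linalg.svd(h)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return (mobile - p_cen) @ rot.T + q_cen


def rmsd(a, b):
    """Root-mean-square deviation between two (n_atoms, 3) arrays."""
    return float(np.sqrt(((a - b) ** 2).sum(axis=1).mean()))


# Synthetic example (hypothetical data): 20 atoms, the last 5 "disordered".
rng = np.random.default_rng(0)
reference = rng.normal(size=(20, 3))

theta = np.deg2rad(40.0)
rot_z = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
# Second model: same structure, rigidly moved, with a noisy tail.
model = reference @ rot_z.T + np.array([5.0, -2.0, 1.0])
model[15:] += rng.normal(scale=3.0, size=(5, 3))

core = np.arange(15)  # assumed core atom set, chosen by hand here
fitted = kabsch_superimpose(model, reference, core)
core_rmsd = rmsd(fitted[core], reference[core])
all_rmsd = rmsd(fitted, reference)
print(f"core RMSD: {core_rmsd:.3g}, all-atom RMSD: {all_rmsd:.3g}")
```

In this toy setup the core atoms differ only by a rigid-body motion, so the core-only RMSD is numerically zero, while the all-atom RMSD is inflated by the poorly determined tail. A real application would need a principled way to select `core`, which is the problem the paper addresses.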
