Using distances between α‐carbons to predict protein structure
Author(s) - Crecca Christina R., Roitberg Adrian E.
Publication year - 2008
Publication title - International Journal of Quantum Chemistry
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.484
H-Index - 105
eISSN - 1097-461X
pISSN - 0020-7608
DOI - 10.1002/qua.21769
Subject(s) - decoy, protein structure, protein structure prediction, algorithm, protein data bank, biological system, residue (chemistry), set (abstract data type), computer science, chemistry, biology, biochemistry, receptor, programming language
Knowledge of a protein's structure is important in understanding its function. The usual experimental structure determination methods can be costly and time‐consuming. We present an idea for a fast and inexpensive protein structure prediction method that combines modeling with less expensive experimental data. Our method involves three steps: (1) building a decoy set, (2) measuring inter‐residue distances in a target protein, and (3) comparing the measured distances with those calculated in each decoy. We postulate that structures that share a small number of similar inter‐residue distances will also have similar three‐dimensional structures. We further hypothesize that the minimum number of distances needed to determine structure is much smaller than the total number of inter‐residue distances in the protein. To develop our protocol, we apply our method to target proteins whose structures have been solved experimentally but have not been included in the decoy set. We simulate experimental data by calculating α‐carbon distances from the experimentally determined structures of our target proteins. We have created a large, generalized decoy set using most of the structures in the Protein Data Bank. It can be used to study any protein composed of 100 residues or fewer. Using this decoy set, we searched for four proteins; our predicted structures ranged in RMSD from 3.6 to 7.7 Å. We have also analyzed the RMSD distributions of the decoys using the search proteins as references and found the distributions to be similar for each protein. Of the nearly 5,000 Cα‐Cα distances in a 100‐residue protein, knowledge of only twenty‐five distances will usually result in predicting a reliable model. © 2008 Wiley Periodicals, Inc. Int J Quantum Chem, 2008
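The three steps described in the abstract can be illustrated with a minimal Python sketch. Everything named here is hypothetical and not taken from the paper: the function names, the random choice of residue pairs, and in particular the scoring function (a root-mean-square difference between measured and decoy distances) are assumptions made for illustration, since the abstract does not state how the comparison is scored. Coordinates are assumed to be lists of (x, y, z) tuples for the α‐carbons, in Å.

```python
import math
import random
from itertools import combinations


def num_pairs(n_residues):
    """Number of distinct alpha-carbon pairs: n * (n - 1) / 2."""
    return n_residues * (n_residues - 1) // 2


def sample_pairs(n_residues, k, seed=0):
    """Pick k residue pairs at random (assumed selection strategy)."""
    rng = random.Random(seed)
    return rng.sample(list(combinations(range(n_residues), 2)), k)


def measured_distances(coords, pairs):
    """Simulate 'experimental' data: distances taken from a known structure."""
    return {(i, j): math.dist(coords[i], coords[j]) for i, j in pairs}


def distance_score(measured, decoy_coords):
    """RMS difference between measured and decoy distances (assumed metric)."""
    sq = [
        (d - math.dist(decoy_coords[i], decoy_coords[j])) ** 2
        for (i, j), d in measured.items()
    ]
    return math.sqrt(sum(sq) / len(sq))


def rank_decoys(measured, decoys):
    """Return decoy names ordered from best to worst agreement."""
    return sorted(decoys, key=lambda name: distance_score(measured, decoys[name]))


# Toy example: a 4-residue "target" and two hypothetical decoys.
target = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
decoys = {
    "compact": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.9, 3.7, 0.0), (0.1, 3.8, 0.0)],
    "extended": [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)],
}
all_pairs = list(combinations(range(len(target)), 2))
measured = measured_distances(target, all_pairs)
print(rank_decoys(measured, decoys))
print(num_pairs(100))
```

For a 100‐residue protein, `num_pairs(100)` gives 4,950 pairs, which matches the "nearly 5,000" figure in the abstract; calling `sample_pairs(100, 25)` would select a subset of the size the authors report as usually sufficient.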