Another look at the conditions for the extraction of protein knowledge‐based potentials
Author(s) - Betancourt, Marcos R.
Publication year - 2009
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.22320
Subject(s) - pairwise comparison, maxima and minima, stability (learning theory), lattice boltzmann methods, statistical physics, boltzmann constant, boltzmann distribution, protein folding, lattice (music), computer science, physics, chemistry, mathematics, thermodynamics, artificial intelligence, machine learning, mathematical analysis, acoustics, nuclear magnetic resonance
Protein knowledge‐based potentials are effective free energies obtained from databases of known protein structures. They are used to parameterize coarse‐grained protein models in many folding simulation and structure prediction methods. Two common approaches are used in the derivation of knowledge‐based potentials. One assumes that the energy parameters optimize the native structure stability. The other assumes that interaction events are related to their energies according to the Boltzmann distribution, and that they are distributed independently of other events, that is, the quasi‐chemical approximation. Here, these assumptions are systematically tested by extracting contact energies from artificial databases of lattice proteins with predefined pairwise contact energies. Databases of protein sequences are designed either to satisfy the Boltzmann distribution at high or low temperatures, or to simultaneously optimize the native stability and folding kinetics. It is found that the quasi‐chemical approximation, with the ideal reference state, accurately reproduces the true energies for high temperature Boltzmann distributed sequences (weakly interacting residues), but less accurately at low temperatures, where the sequences correspond to energy minima and the residues are strongly interacting. To overcome this problem, an iterative procedure for Boltzmann distributed sequences is introduced, which accounts for interacting residue correlations and eliminates the need for the quasi‐chemical approximation. In this case, the energies are accurately reproduced at any ensemble temperature. However, when the database of sequences designed for optimal stability and kinetics is used, the energy correlation is less than optimal using either method, exhibiting random and systematic deviations from linearity.
Therefore, the assumption that native structures are maximally stable or that sequences are determined according to the Boltzmann distribution seems to be inadequate for obtaining accurate energies. The limited number of sequences in the database and the inhomogeneous concentration of amino acids from one structure to another do not seem to be major obstacles for improving the quality of the extracted pairwise energies, with the exception of repulsive interactions. Proteins 2009. © 2008 Wiley‐Liss, Inc.
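As an illustration of the quasi‐chemical approximation mentioned in the abstract, the following is a minimal Python sketch, not the paper's implementation. It assumes the reference state is random mixing according to database‐wide mole fractions, which may differ from the "ideal reference state" used in the article. The function name `quasi_chemical_energies`, its arguments, and the two‐letter H/P alphabet in the example are all hypothetical choices made for this sketch.

```python
import math
from collections import Counter
from itertools import combinations_with_replacement


def quasi_chemical_energies(contacts, composition, kT=1.0):
    """Estimate pairwise contact energies by Boltzmann inversion.

    contacts:    iterable of (a, b) residue-type pairs observed in contact
    composition: dict mapping residue type -> number of residues in the database
    kT:          energy unit (assumed; energies are returned in units of kT)

    Assumption: the reference state is random mixing, so the expected number
    of a-b contacts is proportional to the product of mole fractions.
    """
    # Count contacts with unordered pairs, so (a, b) and (b, a) are merged.
    observed = Counter(tuple(sorted(pair)) for pair in contacts)
    n_contacts = sum(observed.values())
    n_residues = sum(composition.values())

    energies = {}
    for a, b in combinations_with_replacement(sorted(composition), 2):
        x_a = composition[a] / n_residues
        x_b = composition[b] / n_residues
        # Unlike pairs can form in two orderings, hence the factor of 2.
        multiplicity = 1 if a == b else 2
        expected = n_contacts * x_a * x_b * multiplicity
        n_obs = observed.get((a, b), 0)
        if n_obs == 0:
            continue  # the logarithm is undefined for unobserved pairs
        energies[(a, b)] = -kT * math.log(n_obs / expected)
    return energies


# Toy database: equal numbers of H and P residues, with H-H contacts enriched.
toy_contacts = [("H", "H")] * 40 + [("H", "P")] * 40 + [("P", "P")] * 20
toy_composition = {"H": 50, "P": 50}
toy_energies = quasi_chemical_energies(toy_contacts, toy_composition)
```

In this toy case the enriched H-H pair receives a negative (attractive) energy, while the depleted H-P and P-P pairs receive positive ones. Pairs that are never observed are simply skipped here, which loosely echoes the abstract's remark that repulsive interactions are the hardest to extract.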
