Molecular modeling of protein function regions
Author(s) -
DeWeese-Scott Carol,
Moult John
Publication year - 2004
Publication title -
Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.10519
Subject(s) - casp , sequence (biology) , protein structure database , protein design , computational biology , protein structure prediction , function (biology) , basis (linear algebra) , protein structure , biological system , computer science , molecular model , algorithm , chemistry , biology , mathematics , genetics , gene , stereochemistry , biochemistry , geometry , sequence database
Experimental protein structures often provide extensive insight into the mode and specificity of small molecule binding, and this information is useful for understanding protein function and for the design of drugs. We have performed an analysis of the reliability with which ligand‐binding information can be deduced from computer model structures, as opposed to experimentally derived ones. Models produced as part of the CASP experiments are used. The accuracy of contacts between protein model atoms and experimentally determined ligand atom positions is the main criterion. Only comparative models are included (i.e., models based on a sequence relationship between the protein of interest and a known structure). We find that, as expected, contact errors increase with decreasing sequence identity used as a basis for modeling. Analysis of the causes of errors shows that sequence alignment errors between model and experimental template have the most deleterious effect. In general, good, but not perfect, insight into ligand binding can be obtained from models based on a sequence relationship, providing there are no alignment errors in the model. The results support a structural genomics strategy based on experimental sampling of structure space so that all protein domains can be modeled on the basis of 30% or higher sequence identity. Proteins 2004. © 2004 Wiley‐Liss, Inc.
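The contact criterion described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the 4.0 Å distance cutoff, the function names, the atom labels, and the toy coordinates are all assumptions made for illustration, and the model is assumed to be already superimposed on the experimental structure so that the experimentally determined ligand positions can be used for both.

```python
from math import dist

# Hypothetical cutoff (in angstroms); the abstract does not state the value used.
CONTACT_CUTOFF = 4.0


def ligand_contacts(protein_atoms, ligand_atoms, cutoff=CONTACT_CUTOFF):
    """Return the labels of protein atoms within `cutoff` of any ligand atom."""
    return {
        label
        for label, xyz in protein_atoms.items()
        if any(dist(xyz, lig_xyz) <= cutoff for lig_xyz in ligand_atoms)
    }


def contact_accuracy(model_atoms, experimental_atoms, ligand_atoms,
                     cutoff=CONTACT_CUTOFF):
    """Compare model contacts with experimental contacts to the same ligand.

    Both structures are assumed to share one coordinate frame
    (model superimposed on the experimental structure beforehand).
    """
    reference = ligand_contacts(experimental_atoms, ligand_atoms, cutoff)
    predicted = ligand_contacts(model_atoms, ligand_atoms, cutoff)
    recovered = reference & predicted
    return {
        "fraction_recovered": len(recovered) / len(reference) if reference else 0.0,
        "false_contacts": len(predicted - reference),
    }


# Toy data: one ligand atom at the origin and three labeled protein atoms.
ligand = [(0.0, 0.0, 0.0)]
experimental = {
    "ARG10:NH1": (3.0, 0.0, 0.0),
    "ASP20:OD1": (0.0, 3.5, 0.0),
    "LEU30:CD1": (8.0, 0.0, 0.0),
}
model = {
    "ARG10:NH1": (3.2, 0.0, 0.0),   # contact preserved in the model
    "ASP20:OD1": (0.0, 6.0, 0.0),   # contact lost in the model
    "LEU30:CD1": (3.9, 0.0, 0.0),   # contact absent from the experiment
}

result = contact_accuracy(model, experimental, ligand)
print(result)
```

In this toy case one of the two experimental contacts is reproduced and one spurious contact appears, the kind of error the abstract attributes mainly to alignment mistakes. A real analysis would read coordinates from structure files and would need a principled choice of cutoff and superposition.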