A novel approach to predict active sites of enzyme molecules
Author(s) - Chou KuoChen, Cai Yudong
Publication year - 2004
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.10622
Subject(s) - active site, protein data bank (RCSB PDB), computer science, serine hydrolase, executable, computational biology, enzyme, chemistry, serine, data mining, protein structure, stereochemistry, biochemistry, biology, operating system
Enzymes are critical in many cellular signaling cascades. With many enzyme structures being solved, there is an increasing need for an automated method to identify their active sites. Given the atomic coordinates of an enzyme molecule, how can we predict its active site? This is a vitally important problem because, from the viewpoints of both pure scientific research and industrial application, the active site is the core of an enzyme molecule. In this article, a topological entity was introduced to characterize the enzymatic active site. Based on this concept, the covariant discriminant algorithm was formulated for identifying the active site. The serine hydrolase family was used as a demonstration. The overall success rate by jackknife test for a data set of 88 enzyme molecules was 99.92%, and that for a data set of 50 independent enzyme molecules was 99.91%. It was also shown through an example that the prediction algorithm can be used to detect typographical errors in a PDB file's annotation of the constituent amino acids of the catalytic triad and to suggest a possible correction. The very high success rates are attributed to the introduction of a covariance matrix in the prediction algorithm, which takes into account the coupling effects among the key constituent atoms of the active site. It is anticipated that the novel approach is quite promising and may become a useful high-throughput tool in enzymology, proteomics, and structural bioinformatics. Proteins 2004. © 2004 Wiley‐Liss, Inc.
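The abstract names a covariant discriminant algorithm built on a covariance matrix and evaluated by a jackknife test, but it does not give the formula. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the discriminant is the squared Mahalanobis distance to a class mean plus the log-determinant of that class's covariance matrix, and that a candidate is assigned to the class with the smallest value. The two-dimensional feature vectors, the class labels `active` and `decoy`, and all function names are hypothetical and serve only as illustration.

```python
import math

def mean_vector(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def covariance_matrix(rows, mean):
    """Sample covariance matrix (divisor n - 1) of the feature vectors."""
    n, d = len(rows), len(mean)
    cov = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mean)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += diff[i] * diff[j]
    return [[v / (n - 1) for v in row] for row in cov]

def solve_and_logdet(matrix, vec):
    """Gauss-Jordan elimination with partial pivoting.

    Returns (solution of matrix * x = vec, ln|det(matrix)|).
    """
    d = len(vec)
    aug = [list(row) + [v] for row, v in zip(matrix, vec)]
    logdet = 0.0
    for c in range(d):
        pivot = max(range(c, d), key=lambda r: abs(aug[r][c]))
        if abs(aug[pivot][c]) < 1e-12:
            raise ValueError("singular covariance matrix")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        logdet += math.log(abs(aug[c][c]))
        for r in range(d):
            if r != c:
                f = aug[r][c] / aug[c][c]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[c])]
    return [aug[i][d] / aug[i][i] for i in range(d)], logdet

def covariant_discriminant(x, rows):
    """Assumed form: squared Mahalanobis distance plus ln|covariance|."""
    mean = mean_vector(rows)
    cov = covariance_matrix(rows, mean)
    diff = [a - m for a, m in zip(x, mean)]
    sol, logdet = solve_and_logdet(cov, diff)
    return sum(a * b for a, b in zip(diff, sol)) + logdet

def predict(x, classes):
    """Assign x to the class with the smallest discriminant value."""
    return min(classes, key=lambda label: covariant_discriminant(x, classes[label]))

def jackknife_success_rate(classes):
    """Leave-one-out test: hold out each sample, retrain, and predict it."""
    hits = total = 0
    for label, rows in classes.items():
        for i, x in enumerate(rows):
            held_out = dict(classes)
            held_out[label] = rows[:i] + rows[i + 1:]
            hits += predict(x, held_out) == label
            total += 1
    return hits / total

# Made-up 2-D descriptors; real inputs would be derived from atomic coordinates.
classes = {
    "active": [(1.0, 2.0), (1.2, 2.3), (0.9, 1.8), (1.1, 2.4), (1.3, 2.1)],
    "decoy": [(4.0, 0.5), (4.3, 0.9), (3.8, 0.4), (4.1, 1.0), (4.4, 0.6)],
}

if __name__ == "__main__":
    print(predict((1.05, 2.1), classes))
    print(jackknife_success_rate(classes))
```

Because the covariance matrix carries the off-diagonal terms, correlated features are weighed together rather than one at a time, which is the kind of coupling effect the abstract credits for the high success rates. Each class needs more non-collinear samples than feature dimensions, otherwise the covariance matrix is singular and the sketch raises a `ValueError`.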