Prediction of 3D metal binding sites from translated gene sequences based on remote‐homology templates
Author(s) - Levy Ronen, Edelman Marvin, Sobolev Vladimir
Publication year - 2009
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.22352
Subject(s) - template, homology (biology), computational biology, gene, biology, genetics, computer science, programming language
Database‐scale analysis was performed to determine whether structural models, based on remote homologues, are effective in predicting 3D transition metal binding sites in proteins directly from translated gene sequences. The extent to which side chain modeling alone reduces sensitivity and selectivity is shown to be <10%. Surprisingly, selectivity was not dependent on the level of sequence homology between template and target, or on the presence of a metal ion in the structural template. By applying a modification of the CHED algorithm (Babor et al., Proteins 2008;70:208–217) and machine learning filters, a selectivity of ∼90% was achieved for protein sequences using unrelated structural templates over a sequence identity range of 18–100%. Below ∼18% identity, the number of analyzable target‐template pairs and the predictability of metal binding sites fall off sharply. A full third of structural templates were found to have target partners only in the remote homology range of 18–30%. In this range, nonmetal‐binding templates are calculated to be the majority and serve to predict with 50% sensitivity at the geometric level. Overall, sensitivity at the geometric level for targets having templates in the 18–30% sequence identity range is 73%, with an average of one false positive site per true site. Protein sequences described as “unknown” in the UniProt database and composed largely of unidentified genome project sequences were studied, and their metal binding sites were predicted. A web server for prediction of metal binding sites from protein sequence is provided. Proteins 2009. © 2008 Wiley‐Liss, Inc.
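As a reading aid for the sensitivity and selectivity figures quoted above, the following is a minimal sketch of how such metrics are commonly computed from site-level counts. The definitions used here (sensitivity as TP / (TP + FN), selectivity as TP / (TP + FP)) are an assumption, since the abstract does not state them, and the counts in the example are hypothetical, not data from the paper.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of real binding sites that were predicted (assumed definition)."""
    return tp / (tp + fn)


def selectivity(tp: int, fp: int) -> float:
    """Fraction of predicted sites that are real (assumed definition)."""
    return tp / (tp + fp)


# Hypothetical counts chosen only to illustrate the arithmetic:
# 100 real sites, 73 of them recovered, plus 73 spurious predictions.
tp, fn, fp = 73, 27, 73

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"selectivity = {selectivity(tp, fp):.2f}")
```

Under these assumed definitions, one false positive for every correctly predicted site corresponds to a selectivity of 0.5 before any filtering step is applied.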