The CATH extended protein‐family database: Providing structural annotations for genome sequences
Author(s) -
Pearl Frances M.G.,
Lee David,
Bray James E.,
Buchan Daniel W.A.,
Shepherd Adrian J.,
Orengo Christine A.
Publication year - 2002
Publication title -
Protein Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.353
H-Index - 175
eISSN - 1469-896X
pISSN - 0961-8368
DOI - 10.1110/ps.16802
Subject(s) - genbank, sequence database, biology, database, sequence alignment, homology (biology), sequence (biology), genetics, oracle, protein family, sequence analysis, gene, computational biology, computer science, peptide sequence, programming language
An automatic sequence search and analysis protocol (DomainFinder) based on PSI-BLAST and IMPALA, and using conservative thresholds, has been developed for reliably integrating gene sequences from GenBank into their respective structural families within the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath_new). DomainFinder assigns a new gene sequence to a CATH homologous superfamily provided that PSI-BLAST identifies a clear relationship to at least one other Protein Data Bank sequence within that superfamily. This has resulted in an expansion of the CATH protein family database (CATH-PFDB v1.6) from 19,563 domain structures to 176,597 domain sequences. A further 50,000 putative homologous relationships can be identified using less stringent cut-offs and these relationships are maintained within neighbour tables in the CATH Oracle database, pending further evidence of their suggested evolutionary relationship. Analysis of the CATH-PFDB has shown that only 15% of the sequence families are close enough to a known structure for reliable homology modeling. IMPALA/PSI-BLAST profiles have been generated for each of the sequence families in the expanded CATH-PFDB and a web server has been provided so that new sequences may be scanned against the profile library and be assigned to a structure and homologous superfamily.
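
The two-tier assignment rule described in the abstract can be illustrated with a minimal sketch. It is not the DomainFinder implementation: the `Hit` record, the `classify_hits` function, the superfamily labels and both E-value cut-offs are hypothetical, since the abstract does not state the actual thresholds or data structures. The sketch assumes that each PSI-BLAST hit has already been reduced to a query identifier, the superfamily of the matched Protein Data Bank sequence, and an E-value.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical cut-offs for illustration only; the abstract does not give
# the conservative or the less stringent thresholds actually used.
STRICT_EVALUE = 1e-6
LENIENT_EVALUE = 1e-2


@dataclass(frozen=True)
class Hit:
    """One hit between a gene sequence and a PDB sequence (hypothetical record)."""
    query_id: str
    superfamily: str  # superfamily of the matched structure
    evalue: float


def classify_hits(hits: Iterable[Hit]) -> Tuple[Dict[str, str], List[Tuple[str, str]]]:
    """Split queries into confident assignments and putative neighbours."""
    # Keep only the best (lowest E-value) hit for each query sequence.
    best: Dict[str, Hit] = {}
    for hit in hits:
        current = best.get(hit.query_id)
        if current is None or hit.evalue < current.evalue:
            best[hit.query_id] = hit

    assigned: Dict[str, str] = {}
    neighbours: List[Tuple[str, str]] = []
    for query_id, hit in best.items():
        if hit.evalue <= STRICT_EVALUE:
            # Clear relationship: integrate the sequence into the superfamily.
            assigned[query_id] = hit.superfamily
        elif hit.evalue <= LENIENT_EVALUE:
            # Weaker relationship: record it as a neighbour pending further evidence.
            neighbours.append((query_id, hit.superfamily))
        # Anything weaker is left unassigned.
    return assigned, neighbours


example_hits = [
    Hit("gene1", "superfamily_A", 1e-30),
    Hit("gene1", "superfamily_B", 1e-3),
    Hit("gene2", "superfamily_B", 1e-3),
    Hit("gene3", "superfamily_C", 5.0),
]
assigned, neighbours = classify_hits(example_hits)
```

With these made-up hits, `gene1` is assigned to `superfamily_A`, `gene2` is recorded only as a putative neighbour of `superfamily_B`, and `gene3` is left unassigned. Keeping the weaker relationships in a separate list mirrors the neighbour tables mentioned in the abstract, where such relationships are stored without being merged into the family.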
