Premium
The CATH extended protein‐family database: Providing structural annotations for genome sequences
Author(s) -
Pearl Frances M.G.,
Lee David,
Bray James E.,
Buchan Daniel W.A.,
Shepherd Adrian J.,
Orengo Christine A.
Publication year - 2002
Publication title -
protein science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.353
H-Index - 175
eISSN - 1469-896X
pISSN - 0961-8368
DOI - 10.1110/ps.16802
Subject(s) - genbank , sequence database , biology , database , sequence alignment , homology (biology) , sequence (biology) , genetics , oracle , protein family , sequence analysis , gene , computational biology , computer science , peptide sequence , programming language
An automatic sequence search and analysis protocol (DomainFinder) based on PSI-BLAST and IMPALA, and using conservative thresholds, has been developed for reliably integrating gene sequences from GenBank into their respective structural families within the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath_new). DomainFinder assigns a new gene sequence to a CATH homologous superfamily provided that PSI-BLAST identifies a clear relationship to at least one other Protein Data Bank sequence within that superfamily. This has resulted in an expansion of the CATH protein family database (CATH-PFDB v1.6) from 19,563 domain structures to 176,597 domain sequences. A further 50,000 putative homologous relationships can be identified using less stringent cut-offs and these relationships are maintained within neighbour tables in the CATH Oracle database, pending further evidence of their suggested evolutionary relationship. Analysis of the CATH-PFDB has shown that only 15% of the sequence families are close enough to a known structure for reliable homology modeling. IMPALA/PSI-BLAST profiles have been generated for each of the sequence families in the expanded CATH-PFDB and a web server has been provided so that new sequences may be scanned against the profile library and be assigned to a structure and homologous superfamily.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom