
Protein Classification Based on Analysis of Local Sequence-Structure Correspondence
Author(s) - A T Zemla
Publication year - 2006
Language(s) - English
Resource type - Reports
DOI - 10.2172/893991
Subject(s) - protein data bank (rcsb pdb), protein data bank, structural classification of proteins database, protein structure, structural motif, cluster analysis, computer science, structural alignment, protein structure database, computational biology, data mining, set (abstract data type), sequence (biology), pattern recognition (psychology), sequence alignment, artificial intelligence, sequence database, peptide sequence, biology, genetics, biochemistry, gene, programming language
The goal of this project was to develop an algorithm to detect and calculate common structural motifs in compared structures, and to define a set of numerical criteria for fully automated, motif-based protein structure classification. The Protein Data Bank (PDB) contains more than 33,000 experimentally solved protein structures, and the Structural Classification of Proteins (SCOP) database, a manual classification of these structures, cannot keep pace with the rapid growth of the PDB. In our approach, called STRALCP (STRucture ALignment based Clustering of Proteins), we generate detailed information about global and local similarities within a given set of structures, identify similar fragments that are conserved across the analyzed proteins, and use these conserved regions (the detected structural motifs) to classify the proteins.
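The abstract does not give STRALCP's scoring or clustering criteria, so the following is only a minimal Python sketch of the general idea (find fragments conserved across all compared structures, then group structures by pairwise similarity). The function names, the boolean per-residue match flags, the similarity scores, the threshold, and the use of single-linkage grouping are all illustrative assumptions, not details taken from the report.

```python
from typing import Dict, List, Sequence, Tuple


def conserved_segments(
    matches: Sequence[Sequence[bool]], min_len: int = 3
) -> List[Tuple[int, int]]:
    """Find runs of reference residues matched in every compared structure.

    matches[k][i] is True when residue i of the reference is structurally
    matched in compared structure k (hypothetical input format).
    Returns (start, end) index pairs, end inclusive, for runs >= min_len.
    """
    if not matches:
        return []
    n = len(matches[0])
    conserved = [all(row[i] for row in matches) for i in range(n)]
    segments: List[Tuple[int, int]] = []
    start = None
    # A trailing False acts as a sentinel so the final run is closed.
    for i, flag in enumerate(conserved + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_len:
                segments.append((start, i - 1))
            start = None
    return segments


def cluster_by_similarity(
    names: Sequence[str],
    similarity: Dict[Tuple[str, str], float],
    threshold: float,
) -> List[List[str]]:
    """Single-linkage grouping: link any pair scoring >= threshold."""
    parent = {name: name for name in names}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in similarity.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(group) for group in groups.values())


# Toy data (made up for illustration only).
flags = [
    [True, True, True, True, False, True, True],
    [True, True, True, False, False, True, True],
]
segments = conserved_segments(flags, min_len=3)

scores = {("A", "B"): 0.9, ("B", "C"): 0.8, ("C", "D"): 0.2, ("A", "D"): 0.1}
clusters = cluster_by_similarity(["A", "B", "C", "D"], scores, threshold=0.7)
```

In this toy run, only the first three residues form a conserved segment of sufficient length, and structures A, B, and C fall into one group while D remains on its own. A real pipeline would derive the match flags and scores from structure alignments rather than hand-written values.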