z-logo
open-access-imgOpen Access
DIVCLUS: an automatic method in the GEANFAMMER package that finds homologous domains in single- and multi-domain proteins.
Author(s) -
JongEun Park,
Sarah A. Teichmann
Publication year - 1998
Publication title -
bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/14.2.144
Subject(s) - gene duplication , sequence (biology) , genome , computational biology , genetics , structural classification of proteins database , simple (philosophy) , biology , segmental duplication , computer science , gene , protein structure , gene family , biochemistry , philosophy , epistemology
Large-scale determination of relationships between the proteins produced by genome sequences is now common. All protein sequences are matched and those that have high match scores are clustered into families. In cases where the proteins are built of several domains or duplication modules, this can lead to misleading results. Consider the very simple example of three proteins: 1, formed by duplication modules A and B; 2, formed by duplication modules B' and C; and 3, formed by duplication modules C' and D. Duplication modules B and B' are homologous, as are C and C'. Matching the sequences of 1, 2 and 3 followed by simple single-linkage clustering would put all three in the same family, even though proteins 1 and 3 are not related. This is because the different parts of 2 match 1 and 3. This paper describes a procedure, DIVCLUS, that divides such complex clusters of partially related sequences into simple clusters that contain only related duplication modules. In the example just given, it would produce two groups of sequences: the first with domains B of sequence 1 and B of sequence 2, and the second with domain C of sequence 2 and C of sequence 3. DIVCLUS is part of a package called GEANFAMMER, for GEnome ANalysis and protein FAMily MakER. The package automates the detection of families of duplication modules from a protein sequence database.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom