Phylogenetic information improves homology detection
Author(s) - Rehmsmeier Marc, Vingron Martin
Publication year - 2001
Publication title - Proteins: Structure, Function, and Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.699
H-Index - 191
eISSN - 1097-0134
pISSN - 0887-3585
DOI - 10.1002/prot.1156
Subject(s) - phylogenetic tree , pairwise comparison , homology (biology) , hidden markov model , tree (set theory) , multiple sequence alignment , sequence database , sequence alignment , k d tree , biology , computer science , artificial intelligence , mathematics , combinatorics , genetics , algorithm , gene , tree traversal , peptide sequence
Abstract - We present a database search method that is based on phylogenetic trees (treesearch). The method is used to search a protein sequence database for homologs of a protein family. In preparation for the search, a phylogenetic tree is constructed from a given multiple alignment of the family. During the search, each database sequence is temporarily inserted into the tree, thus adding a new edge to the tree. Homology between family and sequence is then judged from the length of this edge. We compared our method to profiles (ISREC pfsearch), two implementations of hidden Markov models (HMMER hmmsearch and SAM hmmscore), and the family pairwise search (FPS) method on 43 families from the SCOP database. Based on minimum false‐positive counts (min‐FPCs), we found a considerable gain in sensitivity. In 69% of the test cases, treesearch showed a min‐FPC of at most 50, whereas the two second‐best methods (hmmsearch and FPS) reached this performance in only 53% of cases. A similar picture holds for a large range of min‐FPC thresholds. The results demonstrate that phylogenetic information can significantly improve the detection of distant homologies and justify our method as a useful alternative to existing methods. Proteins 2001;45:360–371. © 2001 Wiley‐Liss, Inc.
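The evaluation measure mentioned in the abstract can be illustrated with a small sketch. The abstract does not define min‐FPC, so the code below assumes one common reading: the number of non‐family sequences that score at least as well as the worst‐scoring true family member. The function names `min_fpc` and `fraction_within`, the tie handling (ties count as false positives), and the toy scores are all hypothetical and are not taken from the paper. For a method like treesearch, where a shorter inserted edge suggests closer homology, lower scores are treated as better.

```python
from typing import Iterable, List, Tuple


def min_fpc(scored_hits: Iterable[Tuple[float, bool]],
            higher_is_better: bool = True) -> int:
    """Count non-members ranked at or above the worst-scoring family member.

    scored_hits: (score, is_family_member) pairs for every database sequence.
    Assumption: ties with the worst member are counted as false positives.
    """
    hits: List[Tuple[float, bool]] = list(scored_hits)
    member_scores = [score for score, is_member in hits if is_member]
    if not member_scores:
        raise ValueError("need at least one true family member")

    if higher_is_better:
        worst = min(member_scores)
        return sum(1 for score, is_member in hits
                   if not is_member and score >= worst)

    # Lower is better, e.g. the length of the edge added to the tree.
    worst = max(member_scores)
    return sum(1 for score, is_member in hits
               if not is_member and score <= worst)


def fraction_within(min_fpcs: Iterable[int], threshold: int) -> float:
    """Fraction of test families whose min-FPC is at most `threshold`."""
    values = list(min_fpcs)
    return sum(1 for v in values if v <= threshold) / len(values)


# Toy data (invented): edge lengths for four database sequences.
toy_hits = [(0.1, True), (0.2, False), (0.3, True), (0.9, False)]
toy_result = min_fpc(toy_hits, higher_is_better=False)
```

Under these assumptions, a statement such as "a min‐FPC of at most 50 in 69% of the test cases" corresponds to `fraction_within(per_family_min_fpcs, 50)` evaluated over the per‐family values, where `per_family_min_fpcs` is a hypothetical list with one entry per test family.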