Fast and reliable reconstruction of phylogenetic trees with indistinguishable edges
Author(s) - Gronau Ilan, Moran Shlomo, Snir Sagi
Publication year - 2012
Publication title - Random Structures and Algorithms
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.314
H-Index - 69
eISSN - 1098-2418
pISSN - 1042-9832
DOI - 10.1002/rsa.20372
Subject(s) - tree (set theory) , algorithm , mathematics , set (abstract data type) , sequence (biology) , phylogenetic tree , upper and lower bounds , convergence (economics) , reconstruction algorithm , iterative reconstruction , computer science , combinatorics , artificial intelligence , biology , mathematical analysis , biochemistry , gene , economics , genetics , programming language , economic growth
Abstract - Phylogenetic reconstruction methods attempt to reconstruct a tree describing the evolution of a given set of species, using sequences of characters (e.g., DNA) extracted from these species as input. A central goal in this area is to design algorithms that guarantee reliable reconstruction of the tree from short input sequences, assuming common stochastic models of evolution. The fast converging reconstruction algorithms introduced in the last decade dramatically reduced the sequence length required to guarantee accurate reconstruction of the entire tree. However, if the tree in question contains even a few edges that cannot be reliably reconstructed from the input sequences, then known fast converging algorithms may fail to reliably reconstruct all or most of the other edges. This calls for the adaptive approach proposed in this paper, called adaptive fast convergence, in which the set of edges that can be reliably reconstructed gradually increases with the amount of information (length of input sequences) available to the algorithm. This paper presents an adaptive fast converging algorithm that returns a partially resolved topology containing no false edges: edges that cannot be reliably reconstructed are contracted into high-degree vertices. We also present an upper bound on the weights of those contracted edges, which is determined by the length of the input sequences and the depth of the tree. As such, the reconstruction guarantee provided by our algorithm for individual edges is significantly stronger than any previously published edge reconstruction guarantee. This fact, together with the optimal complexity of our algorithm (linear space and quadratic time), makes it appealing for practical use. © 2011 Wiley Periodicals, Inc. Random Struct. Alg., 40, 350–384, 2011
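To make the idea of a "partially resolved topology" concrete, the following minimal Python sketch contracts the short edges of a weighted tree into single high-degree vertices. It is an illustration only and not the algorithm from the paper: it assumes the true edge weights are already known, whereas the paper's algorithm works from sequences. The function name `contract_short_edges`, the `threshold` parameter, and the toy tree are all hypothetical.

```python
from collections import defaultdict


def contract_short_edges(edges, threshold):
    """Contract every tree edge whose weight is below `threshold`.

    edges: list of (u, v, weight) tuples describing an unrooted tree.
    Vertex labels must be mutually comparable (e.g. all strings).
    Returns (groups, kept): `groups` maps each representative vertex to the
    sorted list of original vertices merged into it, and `kept` lists the
    surviving edges, relabelled by representative.
    Hypothetical helper for illustration; not the paper's method.
    """
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v, w in edges:
        ru, rv = find(u), find(v)
        if w < threshold and ru != rv:
            # Merge the two endpoints; the smaller label becomes the representative.
            parent[max(ru, rv)] = min(ru, rv)

    kept = [(find(u), find(v), w) for u, v, w in edges if w >= threshold]
    groups = defaultdict(list)
    for x in list(parent):
        groups[find(x)].append(x)
    return {rep: sorted(members) for rep, members in groups.items()}, kept


# Toy quartet tree: leaves a, b, c, d and internal vertices x, y.
# The internal edge (x, y) is very short, so it is treated as unresolvable.
toy_edges = [
    ("a", "x", 0.3),
    ("b", "x", 0.4),
    ("x", "y", 0.01),
    ("y", "c", 0.5),
    ("y", "d", 0.2),
]
groups, kept = contract_short_edges(toy_edges, threshold=0.05)
print(groups["x"])  # ['x', 'y']: the two internal vertices are merged
print(len(kept))    # 4: all four leaves now attach to one degree-4 vertex
```

In this toy case the fully resolved quartet collapses into a star. The result contains no false edge, but it also says nothing about how the four leaves pair up, which is the trade-off the abstract describes.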