Efficient estimation of pairwise distances between genomes
Author(s) - Mirjana Domazet-Lošo, Bernhard Haubold
Publication year - 2009
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btp590
Subject(s) - genome, computer science, pairwise comparison, tree traversal, tree (set theory), suffix tree, sequence (biology), source code, algorithm, combinatorics, data structure, biology, mathematics, genetics, artificial intelligence, gene, programming language, operating system
Genome comparison is central to contemporary genomics and typically relies on sequence alignment. However, genome-wide alignments are difficult to compute. We have, therefore, recently developed an accurate alignment-free estimator of the number of substitutions per site based on the lengths of exact matches between pairs of sequences. The previous implementation of this measure requires n(n-1) suffix tree constructions and traversals, where n is the number of sequences analyzed. This does not scale well for large n.
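The two quantities the abstract relies on, exact-match lengths between a pair of sequences and the n(n-1) ordered pairs, can be illustrated with a minimal sketch. It assumes a naive substring scan in place of the suffix tree the abstract mentions, and it stops at the raw match lengths. It does not reproduce the authors' estimator of substitutions per site. The function names `match_lengths`, `mean_match_length`, and `ordered_pairs` are hypothetical and chosen only for this illustration.

```python
from itertools import permutations


def match_lengths(query: str, subject: str) -> list[int]:
    """For each position i in `query`, return the length of the longest
    prefix of query[i:] that occurs somewhere in `subject`.

    Naive scan for illustration only; a suffix tree of `subject` would
    answer the same question far more efficiently.
    """
    lengths = []
    for i in range(len(query)):
        k = 0
        # Extend the match one character at a time while it still occurs.
        while i + k < len(query) and query[i:i + k + 1] in subject:
            k += 1
        lengths.append(k)
    return lengths


def mean_match_length(query: str, subject: str) -> float:
    """Average exact-match length over all positions of a non-empty query."""
    lengths = match_lengths(query, subject)
    return sum(lengths) / len(lengths)


def ordered_pairs(names: list[str]) -> list[tuple[str, str]]:
    """All ordered (query, subject) pairs: n(n-1) of them for n sequences."""
    return list(permutations(names, 2))


if __name__ == "__main__":
    print(match_lengths("ACGT", "ACGA"))
    print(mean_match_length("ACGT", "ACGA"))
    print(len(ordered_pairs(["s1", "s2", "s3", "s4"])))
```

Longer average match lengths indicate more similar sequences, which is the intuition behind a match-length-based distance. The ordered-pair count shows why one tree construction per pair scales poorly as n grows.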