Effective cluster-based seed design for cross-species sequence comparisons
Author(s) - Leming Zhou, Ingrid Mihai, Liliana Florea
Publication year - 2008
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btn547
Subject(s) - pairwise comparison, closeness, genome, cluster (spacecraft), sequence (biology), source code, biology, computational biology, computer science, similarity (geometry), data mining, gene, genetics, artificial intelligence, mathematics, mathematical analysis, programming language, image (mathematics), operating system
To annotate newly sequenced organisms, cross-species sequence comparison algorithms can be applied to align gene sequences to the genome of a related species. To improve the accuracy of alignment, spaced seeds must be optimized for each comparison. As the number and diversity of genomes increase, an efficient alternative is to cluster pairwise comparisons into groups and identify seeds for groups instead of individual comparisons. Here we investigate a measure of comparison closeness and identify classes of comparisons that show similar seed behavior and therefore can employ the same seed.
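The abstract does not define a spaced seed or list the seeds studied. As a minimal sketch, assuming the common representation of a seed as a string of 1s (positions that must match) and 0s (positions that are ignored), the Python below finds seed hits in an ungapped alignment and measures the fraction of alignments a seed detects. The function names `seed_hits` and `hit_fraction`, the seed `1101`, and the toy sequences are illustrative assumptions, not taken from the paper.

```python
def seed_hits(a: str, b: str, seed: str) -> list[int]:
    """Offsets where two equal-length, ungapped sequences agree at
    every '1' position of the seed placed at that offset."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    # Indices within the seed that are required to match.
    care = [k for k, c in enumerate(seed) if c == "1"]
    span = len(seed)
    return [
        i
        for i in range(len(a) - span + 1)
        if all(a[i + k] == b[i + k] for k in care)
    ]


def hit_fraction(pairs: list[tuple[str, str]], seed: str) -> float:
    """Fraction of sequence pairs with at least one seed hit
    (a simple empirical stand-in for seed sensitivity)."""
    if not pairs:
        return 0.0
    detected = sum(1 for a, b in pairs if seed_hits(a, b, seed))
    return detected / len(pairs)


# Toy example: mismatches at positions 2 and 7.
gene = "ACGTACGT"
genome = "ACTTACGA"
spaced = seed_hits(gene, genome, "1101")      # ignores the third position
contiguous = seed_hits(gene, genome, "111")   # same weight, no gaps
```

In this toy case the spaced seed tolerates the mismatch at position 2 and hits at offset 0, where the contiguous seed of equal weight does not. Comparing `hit_fraction` for one seed across several sets of alignments is one simple way to check whether those comparisons show similar seed behavior, though the closeness measure used in the paper is not described here.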