Achieving Congruency of Phylogenetic Trees Generated by W‐Curves of Genomic Sequences
Author(s) - CORK DOUGLAS J., HUTCH THOMAS B., MARLAND ELIZABETH, ZMUDA JORY
Publication year - 2002
Publication title - Annals of the New York Academy of Sciences
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.712
H-Index - 248
eISSN - 1749-6632
pISSN - 0077-8923
DOI - 10.1111/j.1749-6632.2002.tb04886.x
Subject(s) - phylogenetic tree, genome, computational biology, visualization, biology, a priori and a posteriori, phylogenetics, gene duplication, genomic dna, computer science, genetics, data mining, dna, gene, philosophy, epistemology
Abstract: Comparative genomic analysis at its most fundamental level involves alignment and analysis of linear strings of DNA. Many useful and powerful tools, such as BlastN and ClustalW, are able, respectively, to search for and align similar strings of DNA from a variety of species. However, interesting genomic patterns cannot be immediately visualized within the information content embedded in long genomic strings without extensive a priori knowledge. More problematic is the question of whether we will be able to crystallize long genomic sequences and analyze their true secondary and tertiary structures. It is, of course, these putative motifs that bind to the three‐dimensional structures of proteins and induce replication and transcription events. The W‐curve is a numerical mapping algorithm that allows one to geometrically visualize the information content of genomic motifs. Patterns of Alu, LINEs, SINEs, and duplication sequences may be easily visualized with the W‐curve. It is our hope that this pattern recognition algorithm will lead to visualization tools to track the evolutionary history of motif patterns. The combinatorics of DNA motif crossover‐recombination events will be more easily followed as we continue to sequence more and more genomes. In our laboratory we are currently collaborating with mathematicians and computer scientists to develop and test tools, such as the W‐curve, for analyzing patterns of long genomic sequences. In this paper, we examine the limitations of using the W‐curve to infer the phylogenetic history of species.
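The abstract does not spell out how the W‐curve maps a sequence to geometry, so the following is only an illustrative sketch under stated assumptions, not the authors' implementation. It assumes a chaos-game-style rule: each base is assigned a corner of a square, the curve moves halfway from its current (x, y) position toward that corner, and the sequence index serves as the third coordinate. The corner assignment in `CORNERS`, the function names `w_curve` and `curve_distance`, and the mean point-wise distance used to compare two curves are all hypothetical choices made for illustration.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, int]

# Assumed corner assignment; the abstract does not specify one.
CORNERS: Dict[str, Tuple[float, float]] = {
    "A": (1.0, 1.0),
    "C": (-1.0, 1.0),
    "G": (-1.0, -1.0),
    "T": (1.0, -1.0),
}


def w_curve(sequence: str) -> List[Point]:
    """Map a DNA string to a list of (x, y, index) points.

    Assumed rule: start at the origin and, for each base, move halfway
    toward that base's corner. Unrecognized symbols (e.g. N) leave the
    (x, y) position unchanged but still emit a point for that index.
    """
    x, y = 0.0, 0.0
    points: List[Point] = []
    for i, base in enumerate(sequence.upper()):
        corner = CORNERS.get(base)
        if corner is not None:
            x = (x + corner[0]) / 2.0
            y = (y + corner[1]) / 2.0
        points.append((x, y, i))
    return points


def curve_distance(a: List[Point], b: List[Point]) -> float:
    """Mean Euclidean (x, y) distance over the positions both curves share.

    This is a hypothetical dissimilarity score, not a published metric.
    """
    n = min(len(a), len(b))
    if n == 0:
        return 0.0
    total = sum(
        math.hypot(a[i][0] - b[i][0], a[i][1] - b[i][1]) for i in range(n)
    )
    return total / n


if __name__ == "__main__":
    first = w_curve("ACGTACGT")
    second = w_curve("ACGTTCGT")
    print(first[:2])
    print(curve_distance(first, second))
```

A pairwise table of such scores could, in principle, be handed to a distance-based tree-building method. How well trees built that way agree with alignment-based trees is the kind of limitation the paper says it examines.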