Recombination-filtered genomic datasets by information maximization
Author(s) -
August E. Woerner,
Murray P. Cox,
Michael F. Hammer
Publication year - 2007
Publication title -
Bioinformatics
Language(s) - English
Resource type - Journals
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btm253
Subject(s) - coalescent theory, recombination, computer science, sequence (biology), r package, tree (set theory), biology, computational biology, genetics, mathematics, phylogenetics, gene, combinatorics, computational science
With the increasing amount of DNA sequence data available from natural populations, new computational methods are needed to efficiently process raw sequences into formats that are applicable to a variety of analytical methods. One highly successful approach to inferring aspects of demographic history is grounded in coalescent theory. Many of these methods restrict themselves to perfectly tree-like genealogies (i.e. regions with no observed recombination), because theoretical difficulties prevent ready statistical evaluation of recombining regions. However, determining which recombination-filtered dataset to analyze from a larger recombination-rich genomic region is a non-trivial problem. Current applications primarily aim to quantify recombination rates (rather than produce optimal recombination-filtered blocks), require significant manual intervention, and are impractical for multiple genomic datasets in high-throughput, automated research environments. Here, we present a fast, simple and automatable command-line program that extracts optimal recombination-filtered blocks (no four-gamete violations) from recombination-rich genomic re-sequence data.
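The "no four-gamete violations" condition mentioned in the abstract can be illustrated with a short sketch. It assumes biallelic sites coded as the characters 0 and 1, with one string per haplotype. The function names `four_gamete_violation` and `longest_compatible_block` are hypothetical and do not come from the published program. The sketch simply finds the longest contiguous run of sites in which no pair shows all four gametes; it does not implement the information-maximization criterion the authors use to choose among candidate blocks.

```python
def four_gamete_violation(haplotypes, i, j):
    """Return True if sites i and j together show all four two-locus gametes.

    Assumes biallelic sites (e.g. '0'/'1'); with only two alleles per site,
    four distinct (allele_i, allele_j) pairs means 00, 01, 10 and 11 all occur.
    """
    gametes = {(h[i], h[j]) for h in haplotypes}
    return len(gametes) == 4


def longest_compatible_block(haplotypes):
    """Return (start, end), half-open, of the longest contiguous run of sites
    in which no pair of sites fails the four-gamete test.

    Illustrative sliding window only; this is not the authors' selection rule.
    """
    n_sites = len(haplotypes[0]) if haplotypes else 0
    best = (0, 0)
    start = 0
    for end in range(n_sites):
        # Scan leftwards from the new site; the first conflict found is the
        # rightmost one, so the window must begin just after it.
        for k in range(end - 1, start - 1, -1):
            if four_gamete_violation(haplotypes, k, end):
                start = k + 1
                break
        if end + 1 - start > best[1] - best[0]:
            best = (start, end + 1)
    return best


# Toy data (made up for illustration): 4 haplotypes, 4 segregating sites.
sample = ["0000", "0101", "1001", "1111"]
print(longest_compatible_block(sample))
```

In the toy data, only the first two sites are mutually incompatible, so the window drops the first site and keeps the remaining three. A real tool would also need to handle missing data and multi-allelic sites, and to weigh block length against the number of sequences or segregating sites retained.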