Simulating DNA Coding Sequence Evolution with EvolveAGene 3
Author(s) -
B G Hall
Publication year - 2008
Publication title -
molecular biology and evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 6.637
H-Index - 218
eISSN - 1537-1719
pISSN - 0737-4038
DOI - 10.1093/molbev/msn008
Subject(s) - transversion , indel , biology , indel mutation , sequence (biology) , phylogenetic tree , genetics , coding region , molecular evolution , selection (genetic algorithm) , sequence analysis , substitution (logic) , computational biology , mutation , computer science , dna , gene , artificial intelligence , genotype , single nucleotide polymorphism , programming language
Phylogenetic reconstruction based upon multiple alignments of molecular sequences is important to most branches of modern biology and is central to molecular evolution. Understanding the historical relationships among macromolecules depends upon computer programs that implement a variety of analytical methods. Because it is impossible to know those historical relationships with certainty, assessment of the accuracy of methods and the programs that implement them requires the use of programs that realistically simulate the evolution of DNA sequences. EvolveAGene 3 is a realistic coding sequence simulation program that separates mutation from selection and allows the user to set selection conditions, including variable regions of selection intensity within the sequence and variation in intensity of selection over branches. Variation includes base substitutions, insertions, and deletions. To the best of my knowledge, it is the only program available that simulates the evolution of intact coding sequences. Output includes the true tree and true alignments of the resulting coding sequence and corresponding protein sequences. A log file reports the frequencies of each kind of base substitution, the ratio of transition to transversion substitutions, the ratio of indel to base substitution mutations, and the numbers of silent and amino acid replacement mutations. The realism of the data sets has been assessed by comparing the d(N)/d(S) ratio, the ratio of transition to transversion substitutions, and the ratio of indel to base substitution mutations of the simulated data sets with those parameters of real data sets from the "gold standard" BaliBase collection of structural alignments. Results show that the data sets produced by EvolveAGene 3 are very similar to real data sets, and EvolveAGene 3 is therefore a realistic simulation program that can be used to evaluate a variety of programs and methods in molecular evolution.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom