Inference of Genome Duplications from Age Distributions Revisited
Author(s) - Kevin Vanneste, Yves Van de Peer, Steven Maere
Publication year - 2012
Publication title - Molecular Biology and Evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 6.637
H-Index - 218
eISSN - 1537-1719
pISSN - 0737-4038
DOI - 10.1093/molbev/mss214
Subject(s) - biology, evolutionary biology, inference, gene duplication, phylogenetics, genome, phylogenetic tree, genome evolution, genetics, gene, artificial intelligence, computer science
Whole-genome duplications (WGDs), thought to facilitate evolutionary innovations and adaptations, have been uncovered in many phylogenetic lineages. WGDs are frequently inferred from duplicate age distributions, where they appear as peaks against a background of small-scale duplications. However, the interpretation of duplicate age distributions is complicated by the use of K(S), the number of synonymous substitutions per synonymous site, as a proxy for the age of paralogs. Two concerns stand out: the stochastic nature of synonymous substitutions, which makes K(S) increasingly uncertain as the time since duplication grows, and K(S) saturation, caused by the inability of evolutionary models to fully correct for multiple substitutions at the same site. K(S) stochasticity is expected to erode the signal of older WGDs, whereas K(S) saturation may lead to artificial peaks in the distribution. Here, we investigate the consequences of these effects on K(S)-based age distributions and WGD inference by simulating the evolution of duplicated sequences according to predefined real age distributions and re-estimating the corresponding K(S) distributions. We show that, although K(S) estimates can be used for WGD inference far beyond the commonly accepted K(S) threshold of 1, K(S) saturation effects can cause artificial peaks at higher ages. Moreover, K(S) stochasticity and saturation may lead to confounded peaks encompassing multiple WGD events and/or saturation artifacts. We argue that K(S) effects need to be properly accounted for when inferring WGDs from age distributions and that failure to do so could lead to false inferences.
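The two effects named in the abstract, growing uncertainty with age and saturation, can be illustrated with a toy simulation. The sketch below is not the authors' procedure: it assumes the simple Jukes-Cantor nucleotide model as a stand-in for the codon models normally used to estimate K(S), and the function names, the number of sites per pair, and the number of simulated pairs are all hypothetical choices made for illustration. For a given true distance it draws the number of differing sites in each simulated duplicate pair, re-estimates the distance with the Jukes-Cantor correction, and reports how widely the estimates scatter and how many pairs are too diverged to correct at all.

```python
import math
import random
import statistics


def jc_expected_diff(d):
    """Expected proportion of differing sites at true distance d (Jukes-Cantor)."""
    return 0.75 * (1.0 - math.exp(-4.0 * d / 3.0))


def jc_distance(p_hat):
    """Jukes-Cantor corrected distance; None if p_hat >= 0.75 (saturated)."""
    if p_hat >= 0.75:
        return None
    return -0.75 * math.log(1.0 - 4.0 * p_hat / 3.0)


def simulate_estimates(true_d, n_sites=300, n_pairs=2000, seed=0):
    """Re-estimate the distance for n_pairs simulated duplicate pairs.

    Assumption: each of the n_sites sites differs independently with the
    Jukes-Cantor probability, so no sequences are actually evolved.
    Returns (list of finite estimates, number of saturated pairs).
    """
    rng = random.Random(seed)
    p = jc_expected_diff(true_d)
    estimates, saturated = [], 0
    for _ in range(n_pairs):
        # Count sites observed to differ between the two copies.
        diffs = sum(rng.random() < p for _ in range(n_sites))
        est = jc_distance(diffs / n_sites)
        if est is None:
            saturated += 1
        else:
            estimates.append(est)
    return estimates, saturated


if __name__ == "__main__":
    for d in (0.5, 1.0, 2.0, 3.0):
        est, sat = simulate_estimates(d)
        print(
            f"true d={d:.1f}  mean={statistics.mean(est):.2f}  "
            f"sd={statistics.stdev(est):.2f}  saturated pairs={sat}"
        )
```

Under these assumptions the spread of the re-estimated distances widens as the true distance grows, and a rising share of pairs cannot be corrected. This is the qualitative behaviour the abstract describes for K(S); the numbers themselves should not be read as results from the paper.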