The effect of genetic structure on molecular dating and tests for temporal signal
Author(s) -
Murray Gemma G. R.,
Wang Fang,
Harrison Ewan M.,
Paterson Gavin K.,
Mather Alison E.,
Harris Simon R.,
Holmes Mark A.,
Rambaut Andrew,
Welch John J.
Publication year - 2016
Publication title -
Methods in Ecology and Evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.425
H-Index - 105
ISSN - 2041-210X
DOI - 10.1111/2041-210x.12466
Subject(s) - permutation (music) , evolutionary biology , biology , resampling , genetic data , ancient dna , computational biology , computer science , statistics , data mining , artificial intelligence , mathematics , medicine , population , physics , environmental health , acoustics
Summary - ‘Dated‐tip’ methods of molecular dating use DNA sequences sampled at different times to estimate the age of their most recent common ancestor. Several tests of ‘temporal signal’ are available to determine whether data sets are suitable for such analysis. However, it remains unclear whether these tests are reliable. We investigate the performance of several tests of temporal signal, including some recently suggested modifications. We use simulated data (where the true evolutionary history is known), and whole genomes of methicillin‐resistant Staphylococcus aureus (to show how particular problems arise with real‐world data sets). We show that all of the standard tests of temporal signal are seriously misleading for data where temporal and genetic structures are confounded (i.e. where closely related sequences are more likely to have been sampled at similar times). This is not an artefact of genetic structure or tree shape per se, and can arise even when sequences have measurably evolved during the sampling period. More positively, we show that a ‘clustered permutation’ approach introduced by Duchêne et al. (Molecular Biology and Evolution, 32, 2015, 1895) can successfully correct for this artefact in all cases, and we introduce techniques for implementing this method with real data sets. The confounding of temporal and genetic structures may be difficult to avoid in practice, particularly for outbreaks of infectious disease, or when using ancient DNA. Therefore, we recommend the use of ‘clustered permutation’ for all analyses. The failure of the standard tests may explain why different methods of dating pathogen origins have reached such wildly different conclusions.
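The idea behind a clustered permutation test can be illustrated with a minimal Python sketch. This is not the authors' implementation: the summary does not specify the test statistic or how clusters are defined, so the sketch assumes (a) the statistic is the Pearson correlation between sampling date and root-to-tip distance, (b) cluster labels are supplied by the user, and (c) every sequence in a cluster shares one sampling date. The function names `pearson_r` and `clustered_permutation_test` are hypothetical.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def clustered_permutation_test(distances, dates, clusters, n_perm=999, seed=0):
    """Permute sampling dates among clusters, not among individual tips.

    distances: root-to-tip distance per sequence (assumed statistic input)
    dates:     sampling date per sequence
    clusters:  cluster label per sequence (assumed to be given)
    Returns (observed correlation, one-sided permutation p-value).
    """
    rng = random.Random(seed)
    observed = pearson_r(dates, distances)

    # Assumption: one sampling date per cluster; reject mixed-date clusters.
    cluster_date = {}
    for c, d in zip(clusters, dates):
        if cluster_date.setdefault(c, d) != d:
            raise ValueError(f"cluster {c!r} has more than one sampling date")

    labels = sorted(cluster_date)
    at_least_as_extreme = 0
    for _ in range(n_perm):
        shuffled = [cluster_date[c] for c in labels]
        rng.shuffle(shuffled)
        mapping = dict(zip(labels, shuffled))
        # Every member of a cluster receives the same permuted date.
        perm_dates = [mapping[c] for c in clusters]
        if pearson_r(perm_dates, distances) >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the p-value is never exactly zero.
    p_value = (at_least_as_extreme + 1) / (n_perm + 1)
    return observed, p_value
```

Under these assumptions, a data set with only two clusters sampled at two different times can show a high observed correlation yet a p-value near 0.5, because swapping the two cluster dates is the only alternative arrangement. This is the kind of confounding that permuting dates among individual sequences would miss.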