Biased phylodynamic inferences from analysing clusters of viral sequences
Author(s) - Bethany L. Dearlove, Fei Xiang, Simon D. W. Frost
Publication year - 2017
Publication title - Virus Evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.231
H-Index - 23
ISSN - 2057-1577
DOI - 10.1093/ve/vex020
Subject(s) - coalescent theory , viral phylodynamics , cluster (spacecraft) , population , population size , exponential growth , phylogenetic tree , cluster analysis , transmission (telecommunications) , biology , logistic function , statistics , evolutionary biology , mathematics , demography , computer science , genetics , mathematical analysis , telecommunications , sociology , gene , programming language
Phylogenetic methods are being increasingly used to help understand the transmission dynamics of measurably evolving viruses, including HIV. Clusters of highly similar sequences are often observed, which appear to follow a ‘power law’ behaviour, with a small number of very large clusters. These clusters may help to identify subpopulations in an epidemic, and inform where intervention strategies should be implemented. However, clustering of samples does not necessarily imply the presence of a subpopulation with high transmission rates, as groups of closely related viruses can also occur due to non-epidemiological effects such as over-sampling. It is important to ensure that observed phylogenetic clustering reflects true heterogeneity in the transmitting population, and is not being driven by non-epidemiological effects. We quantify the effect of using a falsely identified ‘transmission cluster’ of sequences to estimate phylodynamic parameters, including the effective population size and exponential growth rate, under several demographic scenarios. Our simulation studies show that taking the maximum-size cluster to re-estimate parameters from trees simulated under a randomly mixing, constant population size coalescent process systematically underestimates the overall effective population size. In addition, the transmission cluster wrongly resembles an exponential or logistic growth model 99% of the time. We also illustrate the consequences of false clusters in exponentially growing coalescent and birth-death trees, where again, the growth rate is skewed upwards. This has clear implications for identifying clusters in large viral databases, where a false cluster could result in wasted intervention resources.
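The underestimation of effective population size described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' pipeline: it assumes all tips are sampled at the same time, defines a "cluster" as the largest clade whose root height falls below an arbitrary threshold (a stand-in for a genetic-distance cutoff), and re-estimates the population size with a simple maximum-likelihood formula applied to the known coalescence times rather than with a full phylodynamic inference. All function names and parameter values are hypothetical.

```python
import random


def simulate_coalescent(n_tips, ne, rng):
    """Simulate a constant-size Kingman coalescent tree.

    Returns a list of (height, tips) pairs, one per internal node,
    where tips is the frozenset of tip indices below that node.
    """
    lineages = [frozenset([i]) for i in range(n_tips)]
    t = 0.0
    nodes = []
    while len(lineages) > 1:
        k = len(lineages)
        rate = k * (k - 1) / 2 / ne          # pairwise coalescence rate
        t += rng.expovariate(rate)
        i, j = rng.sample(range(k), 2)       # random pair merges
        merged = lineages[i] | lineages[j]
        lineages = [l for idx, l in enumerate(lineages) if idx not in (i, j)]
        lineages.append(merged)
        nodes.append((t, merged))
    return nodes


def estimate_ne(heights):
    """MLE of a constant population size from coalescence heights,
    treating the m tips as a random sample from the population."""
    heights = sorted(heights)
    m = len(heights) + 1
    total, prev = 0.0, 0.0
    for idx, h in enumerate(heights):
        k = m - idx                          # lineages during this interval
        total += k * (k - 1) / 2 * (h - prev)
        prev = h
    return total / (m - 1)


def largest_cluster_heights(nodes, threshold):
    """Node heights inside the largest clade rooted below the threshold
    (assumed clustering rule, chosen only for illustration)."""
    eligible = [(h, tips) for h, tips in nodes if h <= threshold]
    if not eligible:
        return []
    _, cluster = max(eligible, key=lambda node: len(node[1]))
    return [h for h, tips in nodes if tips <= cluster]


def run(n_reps=200, n_tips=50, ne=1000.0, threshold=100.0, seed=1):
    """Average the full-tree and largest-cluster estimates over replicates."""
    rng = random.Random(seed)
    full, clustered = [], []
    for _ in range(n_reps):
        nodes = simulate_coalescent(n_tips, ne, rng)
        full.append(estimate_ne([h for h, _ in nodes]))
        heights = largest_cluster_heights(nodes, threshold)
        if heights:
            clustered.append(estimate_ne(heights))
    return sum(full) / len(full), sum(clustered) / len(clustered)


TRUE_NE = 1000.0
mean_full, mean_cluster = run(ne=TRUE_NE)
print(f"true Ne: {TRUE_NE:.0f}")
print(f"mean estimate, full tree: {mean_full:.0f}")
print(f"mean estimate, largest cluster: {mean_cluster:.0f}")
```

Under these assumptions the full-tree estimate should sit close to the true value, while the estimate from the largest cluster should fall well below it. The cluster was selected because its lineages coalesced recently, so treating it as a random sample from the population mistakes that selection for a small population.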