An investigation of error sources and their impact in estimating the time to the most recent ancestor of spatially and temporally distributed HIV sequences
Author(s) - Burr Tom L., Gattiker James R., Gerrish Philip J.
Publication year - 2003
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.1508
Subject(s) - most recent common ancestor , confidence interval , range (aeronautics) , statistics , sequence (biology) , population , econometrics , coverage probability , computer science , mathematics , biology , phylogenetic tree , demography , genetics , materials science , gene , composite material , sociology
This is an investigation of significant error sources and their impact in estimating the time to the most recent common ancestor (MRCA) of spatially and temporally distributed human immunodeficiency virus (HIV) sequences. We simulate an HIV epidemic under a range of assumptions with known time to the MRCA (tMRCA). We then apply a range of baseline (known) evolutionary models to generate sequence data. We next estimate or assume one of several misspecified models and use the chosen model to estimate the tMRCA. Random effects and the extent of model misspecification determine the magnitude of the error sources, which could include: neglected heterogeneity in substitution rates across lineages and DNA sites; uncertainty in HIV isolation times; uncertain magnitude and type of population subdivision; uncertain impacts of host/viral transmission dynamics; and unavoidable model estimation errors. Our results suggest that confidence intervals (CIs) will rarely have the nominal coverage probability for tMRCA. Neglected effects lead to errors that are unaccounted for in most analyses, resulting in optimistically narrow CIs. Using real HIV sequences having approximately known isolation times and locations, we present possible CIs for several sets of assumptions. In general, we cannot be certain how much to broaden a stated CI for tMRCA. However, we describe the impact of candidate error sources on CI width. We also determine which error sources have the most impact on CI width and demonstrate that the standard bootstrap method will underestimate the CI width. Copyright © 2003 John Wiley & Sons, Ltd.
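
The coverage argument in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' simulation: it assumes a single lineage whose substitution count is Poisson with mean equal to tMRCA times a known substitution rate, and it models neglected rate heterogeneity across lineages as a gamma-distributed multiplier with mean 1. The analyst's interval ignores that multiplier. All names and parameter values (`ci_coverage`, `draw_poisson`, `subs_per_year`, `gamma_shape`) are hypothetical choices made for illustration.

```python
import math
import random


def draw_poisson(rng, mean):
    """Draw a Poisson count by counting unit-rate arrivals in [0, mean]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def ci_coverage(gamma_shape=None, n_rep=2000, true_tmrca=50.0,
                subs_per_year=2.0, seed=1):
    """Fraction of nominal 95% intervals that contain the true tMRCA.

    gamma_shape=None means the assumed homogeneous-rate model is correct.
    Otherwise the true rate is scaled by a Gamma(shape, 1/shape) multiplier
    (mean 1) that the analyst neglects. This is a toy assumption, not the
    evolutionary models used in the paper.
    """
    rng = random.Random(seed)
    covered = 0
    for _ in range(n_rep):
        if gamma_shape is None:
            multiplier = 1.0
        else:
            multiplier = rng.gammavariate(gamma_shape, 1.0 / gamma_shape)
        count = draw_poisson(rng, true_tmrca * subs_per_year * multiplier)
        # The analyst assumes a pure Poisson count with the nominal rate.
        estimate = count / subs_per_year
        half_width = 1.96 * math.sqrt(count) / subs_per_year
        if estimate - half_width <= true_tmrca <= estimate + half_width:
            covered += 1
    return covered / n_rep


if __name__ == "__main__":
    print("coverage, model correct:       ", ci_coverage(None))
    print("coverage, heterogeneity ignored:", ci_coverage(4.0))
```

Under these assumptions the interval built from the correctly specified model covers the true value at close to the nominal rate, while the interval that ignores the rate multiplier covers it far less often, because its width reflects only Poisson sampling noise and none of the extra variance.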
