Open Access
Model-based estimation of superinfection prevalence from limited datasets
Author(s) - Daniel B. Reeves, Amalia Magaret, Alexander L. Greninger, Christine Johnston, Joshua T. Schiffer
Publication year - 2018
Publication title - Journal of the Royal Society Interface
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.655
H-Index - 139
eISSN - 1742-5662
pISSN - 1742-5689
DOI - 10.1098/rsif.2017.0968
Subject(s) - estimation, superinfection, small area estimation, statistics, computer science, econometrics, biology, data mining, computational biology, mathematics, genetics, estimator, management, economics, virus
Humans can be infected sequentially by different strains of the same virus. Estimating the prevalence of so-called ‘superinfection’ for a particular pathogen is vital because superinfection implies a failure of immunologic memory against a given virus despite past exposure, which may signal challenges for future vaccine development. Increasingly, viral deep sequencing and phylogenetic inference can discriminate distinct strains within a host. Yet a population-level study may misrepresent the true prevalence of superinfection for several reasons. First, certain infections such as herpes simplex virus (HSV-2) reactivate only single strains, making multiple samples necessary to detect superinfection. Second, the number of samples collected in a study may be fewer than the actual number of independently acquired strains within a single person. Third, detecting strains that are relatively less abundant can be difficult, even for other infections such as HIV-1 where deep sequencing may identify multiple strains simultaneously. Here we develop a model of superinfection inspired by ecology. We define an infected individual's richness as the number of infecting strains and use ecological evenness to quantify the relative strain abundances. The model uses an expectation-maximization (EM) methodology to infer the true prevalence of superinfection from limited clinical datasets. Simulation studies with known true prevalence are used to contrast our EM method with a standard (naive) calculation. While varying richness, evenness and sampling, we quantify the accuracy and precision of our method. The EM method outperforms the naive calculation in all cases, particularly when sampling is low and when richness or unevenness is high. We also consider the sensitivity of the method to our assumptions about clinical data. The simulation studies also provide insight into optimal study designs; estimates of prevalence improve equally whether more participants are enrolled or more samples are gathered per person. Finally, we apply our method to data from published studies of HSV-2 and HIV-1 superinfection.
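The sampling problem described in the abstract can be illustrated with a small simulation. The Python sketch below is not the authors' EM method; it only shows why a naive calculation (the fraction of participants in whom two or more strains were observed) tends to underestimate the true prevalence when few samples are taken or strain abundances are uneven. It assumes an HSV-2-like setting in which each sample reveals exactly one strain, drawn in proportion to its relative abundance. The function names, parameters and numerical values are hypothetical and chosen for illustration only.

```python
import random


def detection_probability(abundances, n_samples):
    """Chance that at least two distinct strains appear in n_samples draws.

    Assumption: each sample shows exactly one strain, drawn independently
    with probability equal to its relative abundance (abundances sum to 1).
    """
    # Superinfection is missed only if every sample shows the same strain.
    return 1.0 - sum(p ** n_samples for p in abundances)


def simulate_naive_prevalence(n_people, true_prevalence, abundances,
                              n_samples, seed=0):
    """Naive estimate: fraction of people with >= 2 strains observed.

    Assumption: every superinfected person carries len(abundances) strains
    with the given relative abundances; everyone else carries one strain.
    """
    rng = random.Random(seed)
    strains = list(range(len(abundances)))
    observed = 0
    for _ in range(n_people):
        if rng.random() < true_prevalence:
            # Draw one strain per sample, weighted by relative abundance.
            seen = set(rng.choices(strains, weights=abundances, k=n_samples))
            if len(seen) >= 2:
                observed += 1
    return observed / n_people


if __name__ == "__main__":
    true_prevalence = 0.3          # hypothetical value
    abundances = [0.9, 0.1]        # richness 2, uneven abundances
    for n_samples in (2, 3, 5, 10):
        naive = simulate_naive_prevalence(20000, true_prevalence,
                                          abundances, n_samples)
        p_detect = detection_probability(abundances, n_samples)
        print(n_samples, round(naive, 3), round(true_prevalence * p_detect, 3))
```

Under these assumptions the naive estimate approaches the true prevalence multiplied by the detection probability, so it moves toward the true value only as the number of samples per person grows. A model-based estimator such as the one described in the abstract is meant to account for this gap.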
