Model-based estimation of superinfection prevalence from limited datasets

Daniel B. Reeves; Amalia Magaret; Alexander L. Greninger; Christine Johnston; Joshua T. Schiffer


Open Access


Publication year - 2018

Publication title - Journal of the Royal Society Interface

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.655

H-Index - 139

eISSN - 1742-5662

pISSN - 1742-5689

DOI - 10.1098/rsif.2017.0968

Subject(s) - estimation, superinfection, small area estimation, statistics, computer science, econometrics, biology, data mining, computational biology, mathematics, genetics, estimator, management, economics, virus

Humans can be infected sequentially by different strains of the same virus. Estimating the prevalence of so-called ‘superinfection’ for a particular pathogen is vital because superinfection implies a failure of immunologic memory against a given virus despite past exposure, which may signal challenges for future vaccine development. Increasingly, viral deep sequencing and phylogenetic inference can discriminate distinct strains within a host. Yet a population-level study may misrepresent the true prevalence of superinfection for several reasons. First, certain infections such as herpes simplex virus 2 (HSV-2) reactivate only single strains, making multiple samples necessary to detect superinfection. Second, the number of samples collected in a study may be fewer than the actual number of independently acquired strains within a single person. Third, detecting strains that are relatively less abundant can be difficult, even for other infections such as HIV-1 where deep sequencing may identify multiple strains simultaneously. Here we develop a model of superinfection inspired by ecology. We define an infected individual's richness as the number of infecting strains and use ecological evenness to quantify the relative strain abundances. The model uses an expectation-maximization (EM) methodology to infer the true prevalence of superinfection from limited clinical datasets. Simulation studies with known true prevalence are used to contrast our EM method with a standard (naive) calculation. While varying richness, evenness and sampling, we quantify the accuracy and precision of our method. The EM method outperforms the naive calculation in all cases, particularly when sampling is low and richness or unevenness is high. We also consider sensitivity to our assumptions about clinical data. The simulation studies also provide insight into optimal study designs; estimates of prevalence improve equally by enrolling more participants or gathering more samples per person. Finally, we apply our method to data from published studies of HSV-2 and HIV-1 superinfection.
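
To illustrate why a naive calculation can understate superinfection prevalence, the following Python sketch computes how often limited sampling would reveal a second strain. It is not the authors' EM method. It assumes, purely for illustration, that each sample reveals exactly one strain, drawn independently with probability equal to that strain's relative abundance. The function names are hypothetical, and Pielou's index is used as one common definition of ecological evenness, which may differ from the measure used in the paper.

```python
import math


def pielou_evenness(abundances):
    """Pielou's evenness J = H / ln(S); an assumed choice of evenness measure."""
    total = sum(abundances)
    probs = [a / total for a in abundances if a > 0]
    if len(probs) < 2:
        return 1.0  # a single strain is trivially "even"
    shannon = -sum(p * math.log(p) for p in probs)
    return shannon / math.log(len(probs))


def detection_probability(abundances, n_samples):
    """Chance that n_samples single-strain samples show at least 2 distinct strains.

    All samples show the same strain i with probability p_i ** n_samples,
    so detection is the complement of the sum over strains.
    """
    total = sum(abundances)
    probs = [a / total for a in abundances]
    return 1.0 - sum(p ** n_samples for p in probs)


def naive_prevalence(true_prevalence, abundances, n_samples):
    """Expected naive estimate if every superinfected person has these abundances."""
    return true_prevalence * detection_probability(abundances, n_samples)


if __name__ == "__main__":
    # Hypothetical scenario: 20% true prevalence, two strains per superinfected host.
    for abundances in ([0.5, 0.5], [0.9, 0.1]):
        for n in (2, 5, 10):
            print(
                abundances,
                "evenness=%.2f" % pielou_evenness(abundances),
                "samples=%d" % n,
                "naive=%.3f" % naive_prevalence(0.2, abundances, n),
            )
```

Under these assumptions, the naive estimate falls further below the true prevalence as samples per person decrease or as strain abundances become more uneven, which matches the conditions the abstract identifies as most challenging.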
