
Stochastic modelling and inference in electronic hospital databases for the spread of infections: Clostridium difficile transmission in Oxfordshire hospitals 2007–2010
Author(s) - Madeleine Cule, Peter Donnelly
Publication year - 2017
Publication title - The Annals of Applied Statistics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.674
H-Index - 75
eISSN - 1941-7330
pISSN - 1932-6157
DOI - 10.1214/16-aoas1011
Subject(s) - inference, Clostridium difficile, transmission (telecommunications), Markov chain Monte Carlo, Bayesian inference, medicine, epidemiology, computer science, data science, data mining, Bayesian probability, biology, artificial intelligence, genetics, telecommunications, antibiotics
The combination of genetic information with electronic patient records promises to provide a powerful new resource for understanding human disease and its treatment. Here we develop a novel stochastic compartmental model and apply it to a large dataset on Clostridium difficile infection (CDI) in three Oxfordshire hospitals over a 2.5-year period, which combines genetic information on 858 confirmed cases of CDI with a database of 750,000 patient records. C. difficile is a major cause of healthcare-associated diarrhoea and is responsible for substantial mortality and morbidity, yet relatively little is known about its biology or its transmission epidemiology. Bayesian analysis of our model, via Markov chain Monte Carlo, provides new information about the biology of CDI, including genetic heterogeneity in infectiousness across different sequence types and evidence for ward contamination as a significant mode of transmission. It also allows inferences about the contribution of particular individuals, wards, or hospitals to transmission of the bacterium, and assessment of changes in these over time following changes in hospital practice. Our work demonstrates the value of using statistical modelling and computational inference on large-scale hospital patient databases and genetic data.
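
To give a rough feel for what Bayesian inference via Markov chain Monte Carlo on a transmission model involves, the sketch below fits a deliberately simplified toy model. It is not the model developed in the paper. It assumes that each susceptible patient on a ward-day is infected with probability 1 - exp(-(alpha + beta * n_inf)), where alpha is a hypothetical background rate, beta a hypothetical per-infectious-patient rate, and n_inf the number of infectious patients on the ward. The Exponential(1) priors, the function names, and the simulated data are all illustrative assumptions.

```python
import math
import random


def infection_prob(alpha, beta, n_inf):
    """Per-patient daily infection probability in the toy model (an assumption)."""
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-(alpha + beta * n_inf))


def simulate(alpha, beta, n_days, rng, n_sus=20):
    """Simulate toy ward-days as (susceptible, infectious, new cases) tuples."""
    data = []
    for _ in range(n_days):
        n_inf = rng.randint(0, 3)  # hypothetical number of infectious patients
        p = infection_prob(alpha, beta, n_inf)
        n_new = sum(rng.random() < p for _ in range(n_sus))
        data.append((n_sus, n_inf, n_new))
    return data


def log_likelihood(alpha, beta, ward_days):
    """Binomial log-likelihood, dropping the constant binomial coefficient."""
    total = 0.0
    for n_sus, n_inf, n_new in ward_days:
        p = infection_prob(alpha, beta, n_inf)
        # log(1 - p) is exactly -(alpha + beta * n_inf) in this model.
        total += n_new * math.log(p) - (n_sus - n_new) * (alpha + beta * n_inf)
    return total


def log_posterior(alpha, beta, ward_days):
    """Log-likelihood plus independent Exponential(1) log-priors (an assumption)."""
    return log_likelihood(alpha, beta, ward_days) - alpha - beta


def metropolis(ward_days, n_iter, step, rng):
    """Random-walk Metropolis sampler for (alpha, beta)."""
    alpha, beta = 0.05, 0.05  # arbitrary positive starting point
    current = log_posterior(alpha, beta, ward_days)
    samples = []
    for _ in range(n_iter):
        a_new = alpha + rng.gauss(0.0, step)
        b_new = beta + rng.gauss(0.0, step)
        # Proposals outside the prior support have zero posterior density
        # and are rejected outright.
        if a_new > 0 and b_new > 0:
            proposed = log_posterior(a_new, b_new, ward_days)
            accept_prob = math.exp(min(0.0, proposed - current))
            if rng.random() < accept_prob:
                alpha, beta, current = a_new, b_new, proposed
        samples.append((alpha, beta))
    return samples
```

As a usage sketch, one could call `simulate(0.01, 0.02, 200, random.Random(0))` to generate toy data, pass the result to `metropolis` with a small step size such as 0.005, and summarise the later samples after discarding an initial burn-in. A realistic analysis of hospital data would need far more structure, for example unobserved infection times, ward contamination, and sequence-type-specific parameters.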