A Supervised Statistical Learning Approach for Accurate Legionella pneumophila Source Attribution during Outbreaks
Author(s) - Andrew H. Buultjens, Kyra Chua, Sarah L. Baines, Jason C. Kwong, Wei Gao, Zoe Cutcher, Stuart Adcock, Susan A. Ballard, Mark B. Schultz, Takehiro Tomita, Nela Subasinghe, Glen P. Carter, Sacha J. Pidot, Lucinda Franklin, Torsten Seemann, Anders Gonçalves da Silva, Benjamin P. Howden, Timothy P. Stinear
Publication year - 2017
Publication title - Applied and Environmental Microbiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.552
H-Index - 324
eISSN - 1070-6291
pISSN - 0099-2240
DOI - 10.1128/aem.01482-17
Subject(s) - outbreak, legionella pneumophila, legionnaires' disease, biology, legionella, single nucleotide polymorphism, computational biology, genetics, virology, genotype, bacteria, gene
Public health agencies are increasingly relying on genomics during Legionnaires' disease investigations. However, the causative bacterium (Legionella pneumophila) has an unusual population structure, with extreme temporal and spatial genome sequence conservation. Furthermore, Legionnaires' disease outbreaks can be caused by multiple L. pneumophila genotypes in a single source. These factors can confound cluster identification using standard phylogenomic methods. Here, we show that a statistical learning approach based on L. pneumophila core genome single nucleotide polymorphism (SNP) comparisons eliminates ambiguity for defining outbreak clusters and accurately predicts exposure sources for clinical cases. We illustrate the performance of our method by genome comparisons of 234 L. pneumophila isolates obtained from patients and cooling towers in Melbourne, Australia, between 1994 and 2014. This collection included one of the largest reported Legionnaires' disease outbreaks, which involved 125 cases at an aquarium. Using only sequence data from L. pneumophila cooling tower isolates and including all core genome variation, we built a multivariate model using discriminant analysis of principal components (DAPC) to find cooling tower-specific genomic signatures and then used it to predict the origin of clinical isolates. Model assignments were 93% congruent with epidemiological data, including the aquarium Legionnaires' disease outbreak and three other unrelated outbreak investigations. We applied the same approach to a recently described investigation of Legionnaires' disease within a UK hospital and observed a model predictive ability of 86%. We have developed a promising means to breach L. pneumophila genetic diversity extremes and provide objective source attribution data for outbreak investigations.

IMPORTANCE Microbial outbreak investigations are moving to a paradigm where whole-genome sequencing and phylogenetic trees are used to support epidemiological investigations. It is critical that outbreak source predictions are accurate, particularly for pathogens, like Legionella pneumophila, which can spread widely and rapidly via cooling system aerosols, causing Legionnaires' disease. Here, by studying hundreds of Legionella pneumophila genomes collected over 21 years around a major Australian city, we uncovered limitations with the phylogenetic approach that could lead to a misidentification of outbreak sources. We implement instead a statistical learning technique that eliminates the ambiguity of inferring disease transmission from phylogenies. Our approach takes geolocation information and core genome variation from environmental L. pneumophila isolates to build statistical models that predict with high confidence the environmental source of clinical L. pneumophila during disease outbreaks. We show the versatility of the technique by applying it to unrelated Legionnaires' disease outbreaks in Australia and the UK.
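
The train-on-environmental, predict-on-clinical workflow described above can be illustrated with a deliberately simplified sketch. It is not the authors' DAPC model: DAPC first reduces the SNP matrix with principal components and then applies discriminant analysis, whereas this stand-in uses a plain nearest-centroid rule on raw SNP profiles. The function names, the 0/1 SNP encoding, the source labels, and the toy data are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Simplified stand-in for source attribution (NOT DAPC itself):
# a nearest-centroid classifier trained only on environmental isolates.
# All names and data below are hypothetical illustrations.


def fit_centroids(profiles, sources):
    """Return the mean allele value at each SNP site for every source label."""
    grouped = defaultdict(list)
    for profile, source in zip(profiles, sources):
        grouped[source].append(profile)
    return {
        source: [sum(column) / len(rows) for column in zip(*rows)]
        for source, rows in grouped.items()
    }


def predict_source(centroids, profile):
    """Assign a profile to the source whose centroid is closest (squared Euclidean)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(profile, centroid))
    return min(centroids, key=lambda source: squared_distance(centroids[source]))


def congruence(predicted, expected):
    """Fraction of predictions that match the epidemiological assignment."""
    matches = sum(p == e for p, e in zip(predicted, expected))
    return matches / len(expected)


# Toy core-genome SNP profiles (0 = reference allele, 1 = alternate allele).
env_profiles = [
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0, 1],
]
env_sources = ["tower_A", "tower_A", "tower_B", "tower_B"]

# Clinical isolates with the source suggested by (made-up) epidemiology.
clinical_profiles = [
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [1, 0, 0, 0, 1, 0],
]
epi_sources = ["tower_A", "tower_B", "tower_A"]

centroids = fit_centroids(env_profiles, env_sources)
predictions = [predict_source(centroids, p) for p in clinical_profiles]
print(predictions, congruence(predictions, epi_sources))
```

The congruence value here plays the same role as the agreement percentages reported in the abstract: it compares model assignments for clinical isolates against independently derived epidemiological links. A faithful reimplementation would replace the centroid rule with principal component reduction followed by discriminant analysis.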
