Open Access
Evaluation of whole-genome sequencing-based subtyping methods for the surveillance of Shigella spp. and the confounding effect of mobile genetic elements in long-term outbreaks
Author(s) -
Isabelle Bernaquez,
Christiane Gaudreau,
Pierre A. Pilon,
Sadjia Békal
Publication year - 2021
Publication title -
microbial genomics
Language(s) - English
Resource type - Journals
ISSN - 2057-5858
DOI - 10.1099/mgen.0.000672
Subject(s) - subtyping , outbreak , multilocus sequence typing , shigella sonnei , biology , whole genome sequencing , genetics , shigellosis , typing , shigella , genome , virology , genotype , salmonella , gene , computer science , bacteria , programming language
Many public health laboratories across the world have implemented whole-genome sequencing (WGS) for the surveillance and outbreak detection of foodborne pathogens. PulseNet-affiliated laboratories have determined that most single-strain foodborne outbreaks are contained within 0–10 multi-locus sequence typing (MLST)-based allele differences and/or core genome single-nucleotide variants (SNVs). In addition to being a food- and travel-associated outbreak pathogen, most Shigella spp. cases occur through continuous person-to-person transmission, predominantly involving men who have sex with men (MSM), leading to long-term and recurrent outbreaks. Continuous transmission patterns coupled to genetic evolution under antibiotic treatment pressure require an assessment of existing WGS-based subtyping methods and interpretation criteria for cluster inclusion/exclusion. An evaluation of 4 WGS-based subtyping methods [SNVPhyl, coreMLST, core genome MLST (cgMLST) and whole-genome MLST (wgMLST)] was performed on 9 foodborne-, travel- and MSM-related retrospective outbreaks from a collection of 91 Shigella flexneri and 232 Shigella sonnei isolates to determine the methods’ epidemiological concordance, discriminatory power, robustness and ability to generate stable interpretation criteria. The discriminatory powers were ranked as follows: coreMLST<SNVPhyl<cgMLST<wgMLST (range: 0.970–1.000). The genetic differences observed for non-MSM-related Shigella spp. outbreaks respect the standard 0–10 allele/SNV guideline; however, mobile genetic element (MGE)-encoded loci caused inflated genetic variation and discrepant phylogenies for prolonged MSM-related S. sonnei outbreaks via wgMLST. The S. sonnei correlation coefficients of wgMLST were also the lowest at 0.680, 0.703 and 0.712 for SNVPhyl, coreMLST and cgMLST, respectively. Plasmid maintenance, mobilization and conjugation-associated genes were found to be the main source of genetic distance inflation in addition to prophage-related genes. Duplicated alleles arising from the repeated nature of IS elements were also responsible for many false cg/wgMLST differences. The coreMLST approach was shown to be the most robust, followed by SNVPhyl and wgMLST for inter-laboratory comparability. Our results highlight the need for validating species-specific subtyping methods based on microbial genome plasticity and outbreak dynamics in addition to the importance of filtering confounding MGEs for cluster detection.