Open Access
Performance and Accuracy of Four Open-Source Tools for In Silico Serotyping of Salmonella spp. Based on Whole-Genome Short-Read Sequencing Data
Author(s) - Laura Uelze, Maria Blettner, Carlus Deneke, István Szabó, Jennie Fischer, Simon H. Tausch, Burkhard Malorny
Publication year - 2020
Publication title - Applied and Environmental Microbiology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.552
H-Index - 324
eISSN - 1070-6291
pISSN - 0099-2240
DOI - 10.1128/aem.02265-19
Subject(s) - in silico, serotype, salmonella, open source, whole genome sequencing, computational biology, genome, biology, dna sequencing, genetics, microbiology and biotechnology, computer science, dna, bacteria, gene, software, programming language
We compared the performance of four open-source in silico Salmonella typing tools (SeqSero, SeqSero2, Salmonella In Silico Typing Resource [SISTR], and Metric Oriented Sequence Typer [MOST]) to assess their potential for replacing laboratory serological testing with serovar predictions from whole-genome sequencing data. We conducted a retrospective analysis of 1,624 Salmonella isolates of 72 serovars submitted to the German National Salmonella Reference Laboratory between 1999 and 2019. All isolates were derived from animal and foodstuff origins. We conducted Illumina short-read sequencing and compared the in silico serovar prediction results with the results of routine laboratory serotyping. We found the best-performing in silico serovar prediction tool to be SISTR, with 94% correctly typed isolates, followed by SeqSero2 (87%), SeqSero (81%), and MOST (79%). Furthermore, we found that mapping-based tools like SeqSero and SeqSero2 (allele mode) were more reliable for the prediction of monophasic variants, while sequence type and cluster-based methods like MOST and SISTR (core-genome multilocus sequence type [cgMLST]) showed greater resilience when confronted with GC-biased sequencing data. We showed that the choice of library preparation kit could substantially affect O antigen detection, due to the low GC content of the wzx and wzy genes. Although the accuracy of computational serovar predictions is still not quite on par with traditional serotyping by Salmonella reference laboratories, the command-line tools investigated in this study perform a rapid, efficient, inexpensive, and reproducible analysis, which can be integrated into in-house characterization pipelines. Based on our results, we find SISTR most suitable for automated, routine serotyping for public health surveillance of Salmonella.

IMPORTANCE Salmonella spp. are important foodborne pathogens. To reduce the number of infected patients, it is essential to understand which subtypes of the bacteria cause disease outbreaks. Traditionally, characterization of Salmonella requires serological testing, a laboratory method by which Salmonella isolates can be classified into over 2,600 distinct subtypes, called serovars. Due to recent advances in whole-genome sequencing, many tools have been developed to replace traditional testing methods with computational analysis of genome sequences. It is crucial to validate that these tools, many already in use for routine surveillance, deliver accurate and reliable serovar information. In this study, we set out to determine which of the currently available open-source command-line tools is most suitable to replace serological testing. A thorough evaluation of the differing computational approaches is highly important to ensure the backward compatibility of serotyping data and to maintain comparability between laboratories.
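
The comparison reported in the abstract (percentage of isolates for which a tool's in silico prediction matches the laboratory serotype) can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names, the isolate IDs, the tool labels `tool_a` and `tool_b`, and the toy serovar calls are all hypothetical, and the sketch assumes a simple case-insensitive exact match. The abstract does not say how ambiguous or partial predictions were scored, so missing calls are simply counted as incorrect here.

```python
def normalize(serovar: str) -> str:
    """Lower-case and collapse whitespace so trivial formatting differences do not count as mismatches."""
    return " ".join(serovar.strip().lower().split())


def concordance(reference: dict, predictions: dict) -> dict:
    """Return, per tool, the percentage of reference isolates typed correctly.

    reference:   isolate ID -> laboratory serovar (hypothetical input layout)
    predictions: tool name -> {isolate ID -> predicted serovar}
    An isolate with no prediction from a tool is counted as incorrect (assumption).
    """
    result = {}
    for tool, calls in predictions.items():
        correct = sum(
            1
            for isolate, ref in reference.items()
            if normalize(calls.get(isolate, "")) == normalize(ref)
        )
        result[tool] = 100.0 * correct / len(reference)
    return result


# Hypothetical toy data, for illustration only (not taken from the study).
reference = {
    "iso1": "Typhimurium",
    "iso2": "Enteritidis",
    "iso3": "Infantis",
    "iso4": "Derby",
}
predictions = {
    "tool_a": {"iso1": "Typhimurium", "iso2": "Enteritidis", "iso3": "Infantis", "iso4": "derby"},
    "tool_b": {"iso1": "Typhimurium", "iso2": "Dublin", "iso3": "Infantis"},
}

scores = concordance(reference, predictions)
for tool, pct in sorted(scores.items()):
    print(f"{tool}: {pct:.0f}% correctly typed")
```

In practice the predicted serovars would be parsed from each tool's output files, and the matching rule would need to handle cases such as monophasic variants, which the abstract identifies as a point where the tools differ.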
