Open Access
Assessment of statistical methods used in library-based approaches to microbial source tracking
Author(s) - Kerry J. Ritter, Ethan A. Carruthers, C. A. Carson, R. D. Ellender, Valerie J. Harwood, Kyle Kingsley, Cindy H. Nakatsu, Michael J. Sadowsky, Brian L. Shear, Brian S. West, John E. Whitlock, Bruce A. Wiggins, Jayson D. Wilbur
Publication year - 2003
Publication title - Journal of Water and Health
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.482
H-Index - 59
eISSN - 1996-7829
pISSN - 1477-8920
DOI - 10.2166/wh.2003.0022
Subject(s) - linear discriminant analysis, statistics, similarity (geometry), identification (biology), sample size determination, false positive paradox, matching (statistics), sample (material), pattern recognition (psychology), artificial intelligence, computer science, mathematics, data mining, biology, ecology, chemistry, chromatography, image (mathematics)
Several commonly used statistical methods for fingerprint identification in microbial source tracking (MST) were examined to assess the effectiveness of pattern-matching algorithms to correctly identify sources. Although numerous statistical methods have been employed for source identification, no widespread consensus exists as to which is most appropriate. A large-scale comparison of several MST methods, using identical fecal sources, presented a unique opportunity to assess the utility of several popular statistical methods. These included discriminant analysis, nearest neighbour analysis, maximum similarity and average similarity, along with several measures of distance or similarity. Threshold criteria for excluding uncertain or poorly matched isolates from final analysis were also examined for their ability to reduce false positives and increase prediction success. Six independent libraries used in the study were constructed from indicator bacteria isolated from fecal materials of humans, seagulls, cows and dogs. Three of these libraries were constructed using the rep-PCR technique and three relied on antibiotic resistance analysis (ARA). Five of the libraries were constructed using Escherichia coli and one using Enterococcus spp. (ARA). Overall, the outcome of this study suggests a high degree of variability across statistical methods. Despite large differences in correct classification rates among the statistical methods, no single statistical approach emerged as superior. Thresholds failed to consistently increase rates of correct classification and improvement was often associated with substantial effective sample size reduction. Recommendations are provided to aid in selecting appropriate analyses for these types of data.
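
To make the difference between two of the matching rules named above concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not define the rules precisely, so this assumes one common reading: "maximum similarity" assigns an isolate to the source of its single best library match, "average similarity" assigns it to the source with the highest mean similarity, and a threshold leaves the isolate unclassified when the winning score is too low. The Jaccard coefficient, the binary fingerprints, the toy library and all function names are assumptions chosen for illustration only.

```python
from collections import defaultdict


def jaccard(a, b):
    """Jaccard similarity between two binary fingerprints (sequences of 0/1)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def classify(isolate, library, rule="max", threshold=0.0):
    """Return the predicted source, or None if the best score is below threshold.

    library: list of (source_label, fingerprint) pairs (hypothetical layout).
    rule: "max" uses each source's single best match;
          "avg" uses each source's mean similarity.
    """
    scores = defaultdict(list)
    for source, fingerprint in library:
        scores[source].append(jaccard(isolate, fingerprint))

    if rule == "max":
        summary = {s: max(v) for s, v in scores.items()}
    elif rule == "avg":
        summary = {s: sum(v) / len(v) for s, v in scores.items()}
    else:
        raise ValueError(f"unknown rule: {rule!r}")

    best = max(summary, key=summary.get)
    # Isolates that match poorly are excluded rather than forced into a class.
    return best if summary[best] >= threshold else None


# Toy data, invented for illustration; not taken from the study.
LIBRARY = [
    ("human", (1, 1, 1, 0, 0, 0)),
    ("human", (0, 0, 0, 1, 1, 1)),
    ("cow", (1, 1, 0, 0, 0, 0)),
    ("cow", (1, 1, 1, 1, 0, 0)),
]
ISOLATE = (1, 1, 1, 0, 0, 0)

print(classify(ISOLATE, LIBRARY, rule="max"))
print(classify(ISOLATE, LIBRARY, rule="avg"))
print(classify(ISOLATE, LIBRARY, rule="avg", threshold=0.8))
```

On this toy library the two rules disagree about the same isolate, and applying a threshold to the average-similarity rule leaves it unclassified, which shrinks the effective sample. This mirrors, in miniature, the kind of method-dependent variability and sample size reduction the abstract reports.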