A machine‐learning approach to detecting unknown bacterial serovars




Author(s) - Akova Ferit, Dundar Murat, Davisson V. Jo, Hirleman E. Daniel, Bhunia Arun K., Robinson J. Paul, Rajwa Bartek

Publication year - 2010

Publication title - Statistical Analysis and Data Mining: The ASA Data Science Journal

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.381

H-Index - 33

eISSN - 1932-1872

pISSN - 1932-1864

DOI - 10.1002/sam.10085

Subject(s) - artificial intelligence, machine learning, computer science, classifier (uml), prior probability, support vector machine, supervised learning, naive bayes classifier, bayesian probability, pattern recognition (psychology), artificial neural network

Technologies for rapid detection of bacterial pathogens are crucial for securing the food supply. A light‐scattering sensor recently developed for real‐time identification of multiple colonies has shown great promise for distinguishing bacterial cultures. The classification approach currently used with this system relies on supervised learning. For accurate classification of bacterial pathogens, the training library should be exhaustive, i.e., it should contain samples of all possible pathogens. Yet the sheer number of existing bacterial serovars and, more importantly, the effect of their high mutation rate make a practical and manageable exhaustive training library infeasible. In this study, we propose a Bayesian approach to learning with a nonexhaustive training dataset for automated detection of unknown bacterial serovars, i.e., serovars for which no samples exist in the training library. The main contribution of our work is the use of Wishart conjugate priors defined over class distributions. This allows us to employ the prior information obtained from known classes to make inferences about unknown classes as well. In this way, we identify new classes of informational value and dynamically update the training dataset with these classes to make it increasingly representative of the sample population. This results in a classifier with improved predictive performance for future samples. We evaluated our approach on a 28‐class bacteria dataset and, for further validation, on the benchmark 26‐class letter recognition dataset. The proposed approach is compared against state‐of‐the‐art methods, including density‐based approaches and support vector domain description, as well as a recently introduced Bayesian approach based on simulated classes.

Copyright © 2010 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 3: 289‐301, 2010
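The general idea of flagging samples from classes missing in the training library can be illustrated with a small sketch. This is not the authors' algorithm: it is a simplified illustration that assumes Gaussian class densities, places an inverse-Wishart prior on each class covariance with its scale taken from the pooled covariance of the known classes, plugs in sample means, and uses a plain quantile threshold on the best class log-density. All function names, the prior degrees of freedom, and the 1% threshold are hypothetical choices made for this example.

```python
import numpy as np

def fit_known_classes(X, y, prior_dof=None):
    """Fit one Gaussian per known class with a shared inverse-Wishart prior (illustrative)."""
    classes = np.unique(y)
    d = X.shape[1]
    # Assumption: prior degrees of freedom m must exceed d + 1; default is d + 2.
    m = prior_dof if prior_dof is not None else d + 2
    means, scatters, counts = [], [], []
    for c in classes:
        Xc = X[y == c]
        mu = Xc.mean(axis=0)
        diff = Xc - mu
        means.append(mu)
        scatters.append(diff.T @ diff)
        counts.append(len(Xc))
    # Prior scale chosen so that the prior mean covariance equals the pooled covariance.
    pooled = sum(scatters) / (sum(counts) - len(classes))
    psi = pooled * (m - d - 1)
    # Posterior mean of each class covariance, treating the sample mean as the class mean.
    covs = [(psi + S) / (m + n - d - 1) for S, n in zip(scatters, counts)]
    return {"classes": classes, "means": means, "covs": covs, "prior_cov": pooled}

def gaussian_logpdf(x, mean, cov):
    """Log-density of a multivariate normal at a single point x."""
    d = mean.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def best_log_density(model, x):
    """Highest log-density that any known class assigns to x."""
    return max(gaussian_logpdf(x, mu, cov)
               for mu, cov in zip(model["means"], model["covs"]))

def fit_threshold(model, X, quantile=0.01):
    """Assumed rule: threshold at a low quantile of the training scores."""
    scores = [best_log_density(model, x) for x in X]
    return float(np.quantile(scores, quantile))

def is_unknown(model, x, threshold):
    """Flag x as a candidate for an unknown class when no known class explains it well."""
    return bool(best_log_density(model, x) < threshold)

# Synthetic two-class example (made-up data, not the bacteria dataset).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
               rng.normal(5.0, 1.0, size=(50, 2))])
y = np.array([0] * 50 + [1] * 50)
model = fit_known_classes(X, y)
threshold = fit_threshold(model, X)
print(is_unknown(model, np.array([20.0, -20.0]), threshold))
print(is_unknown(model, model["means"][0], threshold))
```

In this sketch, `model["prior_cov"]` is the covariance a newly discovered class would start from before it has any samples of its own, which is one simple way to read the abstract's remark that prior information from known classes informs inference about unknown ones.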
