Multiple-Disease Detection and Classification across Cohorts via Microbiome Search
Author(s) -
Xiaoquan Su,
Gongchao Jing,
Zheng Sun,
Lu Liu,
Zhenjiang Zech Xu,
Daniel McDonald,
Zengbin Wang,
Honglei Wang,
Antonio González,
Yufeng Zhang,
Shi Huang,
Gavin Huttley,
Rob Knight,
Jian Xu
Publication year - 2020
Publication title - mSystems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.931
H-Index - 39
ISSN - 2379-5077
DOI - 10.1128/msystems.00150-20
Subject(s) - microbiome, novelty detection, outlier, novelty, identification (biology), amplicon, disease, computer science, amplicon sequencing, human microbiome project, a priori and a posteriori, computational biology, anomaly detection, artificial intelligence, data mining, pattern recognition (psychology), biology, human microbiome, 16s ribosomal rna, bioinformatics, gene, medicine, genetics, polymerase chain reaction, pathology, botany, philosophy, theology, epistemology
Microbiome-based disease classification depends on well-validated disease-specific models or a priori organismal markers. These are lacking for many diseases. Here, we present an alternative, search-based strategy for disease detection and classification, which detects diseased samples via their outlier novelty versus a database of samples from healthy subjects and then compares these to databases of samples from patients. Our strategy's precision, sensitivity, and speed outperform model-based approaches. In addition, it is more robust to platform heterogeneity and to contamination in 16S rRNA gene amplicon data sets. This search-based strategy shows promise as an important first step in microbiome big-data-based diagnosis.
IMPORTANCE Here, we present a search-based strategy for disease detection and classification, which detects diseased samples via their outlier novelty versus a database of samples from healthy subjects and then compares them to databases of samples from patients. This approach enables the identification of microbiome states associated with disease even in the presence of different cohorts, multiple sequencing platforms, or significant contamination.
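The two-step idea in the abstract (flag a sample as an outlier against a healthy database, then compare flagged samples to patient databases) can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration and is not taken from the article: the Bray-Curtis similarity, the top-k averaging, the 0.5 threshold, the function names, and the toy profiles are all hypothetical stand-ins for whatever the authors' search engine actually uses.

```python
from typing import Dict, List, Tuple

# A sample is a taxon -> relative-abundance map (hypothetical representation).
Profile = Dict[str, float]


def similarity(a: Profile, b: Profile) -> float:
    """Bray-Curtis similarity between two profiles (assumed metric)."""
    taxa = set(a) | set(b)
    shared = sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in taxa)
    total = sum(a.values()) + sum(b.values())
    return 2.0 * shared / total if total else 0.0


def mean_top_similarity(query: Profile, database: List[Profile], k: int) -> float:
    """Average similarity of the k best matches found by searching a database."""
    top = sorted((similarity(query, p) for p in database), reverse=True)[:k]
    return sum(top) / len(top)


def novelty(query: Profile, healthy_db: List[Profile], k: int = 3) -> float:
    """Outlier novelty: how unlike its best healthy matches the query is."""
    return 1.0 - mean_top_similarity(query, healthy_db, k)


def detect_and_classify(
    query: Profile,
    healthy_db: List[Profile],
    disease_dbs: Dict[str, List[Profile]],
    threshold: float = 0.5,  # assumed cutoff, not from the article
    k: int = 3,
) -> Tuple[str, float]:
    """Step 1: detect by novelty. Step 2: label by the closest patient database."""
    score = novelty(query, healthy_db, k)
    if score <= threshold:
        return "healthy", score
    label = max(
        disease_dbs,
        key=lambda name: mean_top_similarity(query, disease_dbs[name], k),
    )
    return label, score


# Toy databases, invented purely for illustration.
healthy_db = [{"A": 0.6, "B": 0.4}, {"A": 0.5, "B": 0.5}, {"A": 0.7, "B": 0.3}]
disease_dbs = {
    "disease_X": [{"C": 0.8, "A": 0.2}, {"C": 0.7, "A": 0.3}],
    "disease_Y": [{"D": 0.9, "B": 0.1}, {"D": 0.8, "B": 0.2}],
}

typical_sample = {"A": 0.55, "B": 0.45}
unusual_sample = {"C": 0.75, "A": 0.25}

print(detect_and_classify(typical_sample, healthy_db, disease_dbs))
print(detect_and_classify(unusual_sample, healthy_db, disease_dbs))
```

In this sketch no disease-specific model is trained: the only disease-related inputs are the patient databases themselves, which is the property the abstract emphasizes. A real system would need a similarity measure and threshold validated on actual cohorts.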