Supervised machine learning outperforms taxonomy‐based environmental DNA metabarcoding applied to biomonitoring
Author(s) -
Cordier Tristan,
Forster Dominik,
Dufresne Yoann,
Martins Catarina I. M.,
Stoeck Thorsten,
Pawlowski Jan
Publication year - 2018
Publication title -
Molecular Ecology Resources
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.96
H-Index - 136
eISSN - 1755-0998
pISSN - 1755-098X
DOI - 10.1111/1755-0998.12926
Subject(s) - biology, environmental DNA, taxonomic rank, bioindicator, biological classification, biodiversity, taxonomy (biology), amplicon, taxon, computational biology, DNA barcoding, machine learning, ecology, evolutionary biology, computer science, genetics, polymerase chain reaction, gene
Biodiversity monitoring is the standard for environmental impact assessment of anthropogenic activities. Several recent studies showed that high‐throughput amplicon sequencing of environmental DNA (eDNA metabarcoding) could overcome many limitations of traditional morphotaxonomy‐based bioassessment. Recently, we demonstrated that supervised machine learning (SML) can be used to predict accurate biotic index values from eDNA metabarcoding data, regardless of the taxonomic affiliation of the sequences. However, it is unknown to what extent the accuracy of such models depends on the taxonomic resolution of molecular markers, or how SML compares with metabarcoding approaches targeting well‐established bioindicator species. In this study, we address these issues by training predictive models on five different ribosomal bacterial and eukaryotic markers and measuring how well they assess the environmental impact of marine aquaculture on independent data sets. Our results show that all tested markers yield accurate predictive models and that they all outperform assessment relying solely on taxonomically assigned sequences. Remarkably, we did not find any significant difference in the performance of models built using universal eukaryotic or prokaryotic markers. Given any molecular marker with a taxonomic range broad enough to comprise different potential bioindicator taxa, the SML approach can overcome the limits of taxonomy‐based eDNA bioassessment.
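The taxonomy‐free idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name the learning algorithm, so a simple k‐nearest‐neighbours regressor with Bray‐Curtis dissimilarity stands in for it here, and the read counts, index values, and function names are all hypothetical. The point is only that a model is trained on sequence‐count profiles with known biotic index values and then evaluated on independent samples, without ever assigning the sequences to taxa.

```python
from typing import List, Sequence


def relative_abundance(counts: Sequence[float]) -> List[float]:
    """Convert raw read counts for one sample into proportions."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [c / total for c in counts]


def bray_curtis(a: Sequence[float], b: Sequence[float]) -> float:
    """Bray-Curtis dissimilarity between two abundance profiles."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(x + y for x, y in zip(a, b))
    return numerator / denominator if denominator else 0.0


def knn_predict(train_x: List[List[float]], train_y: List[float],
                sample: Sequence[float], k: int = 2) -> float:
    """Predict a biotic index as the mean index of the k most similar training samples."""
    ranked = sorted((bray_curtis(x, sample), y) for x, y in zip(train_x, train_y))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


# Hypothetical toy data: rows are samples, columns are unassigned sequence clusters.
train_counts = [
    [90, 5, 5, 0],
    [80, 10, 10, 0],
    [10, 10, 40, 40],
    [0, 5, 45, 50],
]
# Hypothetical biotic index values derived from a reference (e.g. morphological) survey.
train_index = [5.5, 5.0, 1.5, 1.0]

# Independent samples, not used for training.
test_counts = [
    [85, 10, 5, 0],
    [5, 5, 40, 50],
]

train_x = [relative_abundance(c) for c in train_counts]
test_x = [relative_abundance(c) for c in test_counts]
predictions = [knn_predict(train_x, train_index, s, k=2) for s in test_x]
print(predictions)
```

Because the features are plain sequence‐cluster counts, the same code applies unchanged to any marker; only the columns of the count table differ.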