Reproducibility of pyrosequencing data for biodiversity assessment in complex communities
Author(s) -
Zhan Aibin,
He Song,
Brown Emily A.,
Chain Frédéric J.J.,
Therriault Thomas W.,
Abbott Cathryn L.,
Heath Daniel D.,
Cristescu Melania E.,
MacIsaac Hugh J.
Publication year - 2014
Publication title -
Methods in Ecology and Evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.425
H-Index - 105
ISSN - 2041-210X
DOI - 10.1111/2041-210x.12230
Subject(s) - operational taxonomic unit, biology, biodiversity, unifrac, reproducibility, pyrosequencing, abundance (ecology), taxonomic rank, replicate, ecology, zooplankton, statistics, taxon, genetics, 16s ribosomal rna, bacteria, mathematics, gene
Summary High‐throughput sequencing is rapidly becoming a popular method to profile complex communities and has generated deep insights into community biodiversity. However, the reproducibility of this method for biodiversity assessment remains largely unexplored. Here we evaluated reproducibility by analysing 454 pyrosequenced biological replicates of two complex plankton communities collected from one freshwater port and one marine port. We also tested whether reproducibility potentially influences biodiversity estimates, notably α‐ and β‐diversity. Our evaluation of reproducibility revealed a complex scenario, having both technical and biological significance. At the Operational Taxonomic Unit (OTU) level, reproducibility was 100% for high‐abundance OTUs (>100 sequences), although it was lower for low‐abundance OTUs, and sometimes <25% for singletons. BLAST searches showed that >88% of irreproducible OTUs had high sequence similarity to existing records, suggesting that some singletons may reflect rare lineages/genotypes in communities. However, spurious amplification of distantly related taxonomic groups generated mainly low‐abundance OTUs that were characterized by low reproducibility. At a broad taxonomic level (i.e. order level), reproducibility decreased as the abundance of OTUs decreased and was particularly low for distantly related taxonomic groups such as algae and protists that were not the targets of our zooplankton biodiversity survey. At a lower taxonomic level (i.e. family level), overall reproducibility was high (>80%) for crustaceans, the dominant group in zooplankton samples. Therefore, we suggest that random variation during both sample collection and sequencing processes can be responsible for low reproducibility. Our analyses also suggest that random sampling processes may influence both α‐ and β‐diversity estimates.
Our results add to growing evidence that caution needs to be applied when designing and interpreting experiments utilizing high‐throughput sequencing data for biodiversity assessments. Technical replicates are needed to statistically correct intra‐sample variation, while field‐based replicate samples are desirable to substantiate results. An overestimation of species diversity can occur when OTUs are uniquely characterized by spuriously amplified sequences and errors/artifacts. Therefore, careful management of low‐abundance OTUs is required to reveal unique/rare lineages. Our results suggest that further studies are needed to determine the ecological significance of low‐abundance OTUs in complex communities.
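
To illustrate the kind of OTU-level comparison the summary describes, here is a minimal Python sketch. It is not the authors' pipeline, and the summary does not state how reproducibility was calculated. The sketch assumes one possible definition: OTUs are grouped by their read count in a reference replicate, and reproducibility is the percentage of OTUs in each group that are also detected in a second replicate. The function name, the abundance bins, and the toy read counts are all hypothetical.

```python
def reproducibility_by_abundance(reference, other, bins):
    """Percentage of OTUs in each abundance bin of `reference` that also occur in `other`.

    reference, other: dict mapping OTU id -> read count in one replicate.
    bins: list of (label, low, high) tuples, inclusive bounds on the read
          count in the reference replicate (assumed binning, not from the paper).
    Returns a dict label -> percentage, or None when a bin holds no OTUs.
    """
    result = {}
    for label, low, high in bins:
        # OTUs whose abundance in the reference replicate falls in this bin
        in_bin = [otu for otu, n in reference.items() if low <= n <= high]
        if not in_bin:
            result[label] = None
            continue
        # An OTU counts as reproduced if the other replicate has at least one read
        recovered = sum(1 for otu in in_bin if other.get(otu, 0) > 0)
        result[label] = 100.0 * recovered / len(in_bin)
    return result


# Hypothetical read counts for two replicates of the same community
replicate_a = {"otu1": 250, "otu2": 120, "otu3": 8,
               "otu4": 1, "otu5": 1, "otu6": 1, "otu7": 1}
replicate_b = {"otu1": 231, "otu2": 140, "otu3": 5, "otu4": 2}

# Assumed bins: singletons, low abundance, high abundance (>100 sequences)
BINS = [
    ("singleton", 1, 1),
    ("low (2-100)", 2, 100),
    ("high (>100)", 101, float("inf")),
]

summary = reproducibility_by_abundance(replicate_a, replicate_b, BINS)
for label, pct in summary.items():
    print(label, pct)
```

With these toy counts, both high-abundance OTUs reappear in the second replicate, while only one of the four singletons does. This mirrors the qualitative pattern reported in the summary but says nothing about the actual data.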