
To assemble or not to resemble—A validated Comparative Metatranscriptomics Workflow (CoMW)
Author(s) -
Muhammad Zohaib Anwar,
Anders Lanzén,
Toke Bang-Andreasen,
Carsten Suhr Jacobsen
Publication year - 2019
Publication title -
GigaScience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.947
H-Index - 54
ISSN - 2047-217X
DOI - 10.1093/gigascience/giz096
Subject(s) - workflow , contig , computer science , sequence assembly , modular design , benchmarking , computational biology , precision and recall , data mining , artificial intelligence , biology , genome , gene , database , genetics , gene expression , transcriptome , marketing , business , operating system
Metatranscriptomics has been widely used to investigate and quantify the activity of microbial communities in response to external stimuli. By assessing the genes expressed, it provides an understanding of the interactions between different major functional guilds and the environment. Metatranscriptomics typically uses short sequence reads, which can either be aligned directly to external reference databases ("assembly-free approach") or first assembled into contigs before alignment ("assembly-based approach"). Here, we present a de novo assembly-based Comparative Metatranscriptomics Workflow (CoMW) implemented in a modular, reproducible structure. We also compare CoMW (the assembly-based implementation) with an assembly-free alternative workflow, using simulated and real-world metatranscriptomes from Arctic and temperate terrestrial environments. We evaluate the accuracy of both workflows in terms of precision and recall, using generic and specialized hierarchical protein databases.
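
As a rough illustration of the two accuracy measures named above, the sketch below computes precision and recall for one set of predicted functional annotations against a known reference set, as would be available for a simulated metatranscriptome. The function name and the gene labels are hypothetical and chosen only for this example; this is not CoMW's own evaluation code, and the abstract does not specify how the comparison is implemented.

```python
def precision_recall(predicted, expected):
    """Return (precision, recall) for two collections of annotation labels."""
    predicted, expected = set(predicted), set(expected)
    true_positives = len(predicted & expected)
    # Precision: fraction of predicted annotations that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of expected annotations that were recovered.
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall


# Hypothetical annotation labels, for illustration only.
expected = {"geneA", "geneB", "geneC", "geneD"}
predicted = {"geneA", "geneB", "geneX"}

precision, recall = precision_recall(predicted, expected)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here two of the three predicted annotations are correct and two of the four expected annotations are recovered, so a workflow can score well on one measure while doing poorly on the other, which is presumably why both are reported.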