MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud

Roberto R. Expósito, Jorge Veiga, Jorge González-Domínguez, Juan Touriño

Open Access

Publication year - 2017

Publication title - Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/btx307

Subject(s) - cloud computing, computer science, software, database, computational biology, information retrieval, world wide web, operating system, biology

This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted MapReduce programming model to fully exploit Big Data technologies on cloud-based infrastructures. Written in Java to maximize cross-platform compatibility, MarDRe is built upon the open-source Apache Hadoop project, the most popular distributed computing framework for scalable Big Data processing. On a 16-node cluster deployed on the Amazon EC2 cloud platform, MarDRe is up to 8.52 times faster than a representative state-of-the-art tool.
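To make the MapReduce idea concrete, here is a minimal single-process sketch in plain Python. It is not MarDRe's code and does not use Hadoop. It assumes, purely for illustration, that reads are grouped by a fixed-length sequence prefix in the map phase and that near-duplicates are same-length reads within a Hamming-distance threshold. The abstract states neither detail, and the names `map_read`, `reduce_group`, `remove_duplicates`, `prefix_len` and `max_mismatches` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Read = Tuple[str, str]  # (read identifier, base sequence)


def map_read(read: Read, prefix_len: int) -> Tuple[str, Read]:
    """Map phase: key each read by its first prefix_len bases (assumed scheme)."""
    _, seq = read
    return seq[:prefix_len], read


def hamming_within(a: str, b: str, max_mismatches: int) -> bool:
    """True if a and b have equal length and differ in at most max_mismatches positions."""
    if len(a) != len(b):
        return False
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > max_mismatches:
                return False
    return True


def reduce_group(reads: Iterable[Read], max_mismatches: int) -> List[Read]:
    """Reduce phase: keep a read only if no already-kept read is within the threshold."""
    kept: List[Read] = []
    for read in reads:
        if not any(hamming_within(read[1], k[1], max_mismatches) for k in kept):
            kept.append(read)
    return kept


def remove_duplicates(reads: Iterable[Read], prefix_len: int = 4,
                      max_mismatches: int = 0) -> List[Read]:
    """Run map, shuffle (group by key), and reduce in a single process."""
    groups: Dict[str, List[Read]] = defaultdict(list)
    for read in reads:
        key, value = map_read(read, prefix_len)
        groups[key].append(value)
    result: List[Read] = []
    for key in sorted(groups):  # sorted keys give a deterministic output order
        result.extend(reduce_group(groups[key], max_mismatches))
    return result


reads = [
    ("r1", "ACGTACGT"),
    ("r2", "ACGTACGT"),  # exact duplicate of r1
    ("r3", "ACGTACGA"),  # one mismatch from r1
    ("r4", "TTGACCAA"),
]
exact = remove_duplicates(reads, prefix_len=4, max_mismatches=0)
near = remove_duplicates(reads, prefix_len=4, max_mismatches=1)
```

With `max_mismatches=0` only the exact duplicate `r2` is dropped. With `max_mismatches=1` the read `r3` is dropped as well. Because each prefix group is reduced independently, groups could be processed on different nodes, which is the property a MapReduce framework exploits. One limitation of this simplified keying is that reads differing inside the prefix land in different groups and are never compared.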
