z-logo
open-access-imgOpen Access
Objective and Comprehensive Evaluation of Bisulfite Short Read Mapping Tools
Author(s) -
Hong Tran,
Jacob Porter,
Ming-an Sun,
Hehuang Xie,
Liqing Zhang
Publication year - 2014
Publication title -
advances in bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.33
H-Index - 20
eISSN - 1687-8035
pISSN - 1687-8027
DOI - 10.1155/2014/472045
Subject(s) - bisulfite , computer science , information retrieval , data science , chemistry , biochemistry , dna methylation , gene expression , gene
Background . Large-scale bisulfite treatment and short reads sequencing technology allow comprehensive estimation of methylation states of Cs in the genomes of different tissues, cell types, and developmental stages. Accurate characterization of DNA methylation is essential for understanding genotype phenotype association, gene and environment interaction, diseases, and cancer. Aligning bisulfite short reads to a reference genome has been a challenging task. We compared five bisulfite short read mapping tools, BSMAP, Bismark, BS-Seeker, BiSS, and BRAT-BW, representing two classes of mapping algorithms (hash table and suffix/prefix tries). We examined their mapping efficiency (i.e., the percentage of reads that can be mapped to the genomes), usability, running time, and effects of changing default parameter settings using both real and simulated reads. We also investigated how preprocessing data might affect mapping efficiency. Conclusion . Among the five programs compared, in terms of mapping efficiency, Bismark performs the best on the real data, followed by BiSS, BSMAP, and finally BRAT-BW and BS-Seeker with very similar performance. If CPU time is not a constraint, Bismark is a good choice of program for mapping bisulfite treated short reads. Data quality impacts a great deal mapping efficiency. Although increasing the number of mismatches allowed can increase mapping efficiency, it not only significantly slows down the program, but also runs the risk of having increased false positives. Therefore, users should carefully set the related parameters depending on the quality of their sequencing data.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom