z-logo
open-access-imgOpen Access
Boiler: lossy compression of RNA-seq alignments using coverage vectors
Author(s) -
Jacob Pritt,
Ben Langmead
Publication year - 2016
Publication title -
nucleic acids research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 9.008
H-Index - 537
eISSN - 1362-4954
pISSN - 0305-1048
DOI - 10.1093/nar/gkw540
Subject(s) - boiler (water heating) , software , computer science , lossy compression , biology , data mining , artificial intelligence , engineering , operating system , waste management
We describe Boiler, a new software tool for compressing and querying large collections of RNA-seq alignments. Boiler discards most per-read data, keeping only a genomic coverage vector plus a few empirical distributions summarizing the alignments. Since most per-read data is discarded, storage footprint is often much smaller than that achieved by other compression tools. Despite this, the most relevant per-read data can be recovered; we show that Boiler compression has only a slight negative impact on results given by downstream tools for isoform assembly and quantification. Boiler also allows the user to pose fast and useful queries without decompressing the entire file. Boiler is free open source software available from github.com/jpritt/boiler .

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom