SPRING: a next-generation compressor for FASTQ data
Author(s) - Shubham Chandak, Kedar Tatwawadi, Idoia Ochoa, Mikel Hernáez, Tsachy Weissman
Publication year - 2018
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/bty1015
Subject(s) - computer science, lossless compression, scalability, lossy compression, data compression, data mining, redundancy (engineering), identifier, compression (physics), database, artificial intelligence, computer network, operating system, materials science, composite material
High-Throughput Sequencing technologies produce huge amounts of data in the form of short genomic reads, associated quality values and read identifiers. Because of the significant structure present in these FASTQ datasets, general-purpose compressors cannot fully exploit the inherent redundancy. Although there has been a lot of work on designing FASTQ compressors, most of them lack support for one or more crucial properties, such as variable-length reads, scalability to high-coverage datasets, pairing-preserving compression and lossless compression.