Impact of lossy compression of nanopore raw signal data on basecalling and consensus accuracy
Author(s) - Shubham Chandak, Kedar Tatwawadi, Srivatsan Sridhar, Tsachy Weissman
Publication year - 2020
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btaa1017
Subject(s) - lossy compression, lossless compression, computer science, data compression, reduction (mathematics), benchmark (surveying), nanopore sequencing, compression (physics), algorithm, artificial intelligence, mathematics, materials science, biochemistry, chemistry, geometry, geodesy, genome, composite material, gene, geography
Nanopore sequencing provides a real-time and portable solution to genomic sequencing, enabling better assembly, structural variant discovery and modified base detection than second-generation technologies. The sequencing process generates a huge amount of data in the form of raw signal contained in fast5 files, which must be compressed to enable efficient storage and transfer. Since the raw data is inherently noisy, lossy compression has the potential to significantly reduce space requirements without adversely impacting the performance of downstream applications.
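The space argument in the abstract can be illustrated with a minimal sketch. It is not the method evaluated in the paper: it assumes a simple uniform quantizer with a bounded per-sample error, uses `zlib` as a stand-in for a general-purpose lossless compressor, and runs on a synthetic noisy step signal rather than real fast5 data. All function names and parameter values here are hypothetical.

```python
import random
import struct
import zlib


def synthetic_signal(n_events=500, samples_per_event=10, noise_sd=8.0, seed=0):
    """Hypothetical stand-in for a raw signal: piecewise-constant levels plus Gaussian noise."""
    rng = random.Random(seed)
    signal = []
    for _ in range(n_events):
        level = rng.randint(400, 600)
        signal.extend(round(rng.gauss(level, noise_sd)) for _ in range(samples_per_event))
    return signal


def quantize(signal, max_error):
    """Map each integer sample to the index of the nearest multiple of 2*max_error + 1."""
    step = 2 * max_error + 1
    return [(x + max_error) // step for x in signal]


def dequantize(indices, max_error):
    """Reconstruct samples; each differs from the original by at most max_error."""
    step = 2 * max_error + 1
    return [q * step for q in indices]


def compressed_size(values):
    """Bytes needed after packing as little-endian int16 and compressing with zlib."""
    packed = struct.pack(f"<{len(values)}h", *values)
    return len(zlib.compress(packed, 9))


if __name__ == "__main__":
    max_error = 5  # assumed error bound, in raw signal units
    signal = synthetic_signal()
    indices = quantize(signal, max_error)
    restored = dequantize(indices, max_error)

    worst = max(abs(a - b) for a, b in zip(signal, restored))
    print("lossless bytes:", compressed_size(signal))
    print("lossy bytes:   ", compressed_size(indices))
    print("max abs error: ", worst)
```

Because the quantizer discards low-order variation that is mostly noise, the quantized stream has fewer distinct values and compresses further than the original. Whether a given error bound is acceptable depends on its effect on basecalling and consensus accuracy, which is the question the paper studies.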