genozip: a fast and efficient compression tool for VCF files
Author(s) -
Divon Lan,
Raymond Tobler,
Yassine Souilmi,
Bastien Llamas
Publication year - 2020
Publication title -
Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btaa290
Subject(s) - lossless compression , computer science , data compression , compression (physics) , field (mathematics) , lossy compression , compression ratio , computer engineering , data mining , operating system , algorithm , engineering , materials science , mathematics , automotive engineering , pure mathematics , composite material , internal combustion engine
Motivation
genozip is a new lossless compression tool for Variant Call Format (VCF) files. By applying field-specific algorithms and fully utilizing the available computational hardware, genozip achieves the highest compression ratios among the lossless compression tools known to the authors, at speeds comparable with the fastest multi-threaded compressors.
Availability and implementation
genozip is freely available to non-commercial users. It can be installed via conda-forge or Docker Hub, or downloaded from github.com/divonlan/genozip.
Supplementary information
Supplementary data are available at Bioinformatics online.