Genomic Sequence Data Compression using Lempel Ziv Welch Algorithm with Indexed Multiple Dictionary
Author(s) - A. S. Keerthy, S. Manju Priya
Publication year - 2019
Publication title - International Journal of Engineering and Advanced Technology
Language(s) - English
Resource type - Journals
ISSN - 2249-8958
DOI - 10.35940/ijeat.b3278.129219
Subject(s) - data compression, computer science, compression (physics), algorithm, compression ratio, throughput, sequence (biology), data mining, biology, engineering, telecommunications, materials science, automotive engineering, composite material, genetics, wireless, internal combustion engine
With advances in technology and the development of High Throughput Systems (HTS), the amount of genomic data generated per day per laboratory across the globe is outpacing Moore's law. This volume of data concerns biologists, both for storage and for transmission between locations for further analysis. Compressing genomic data is a sensible way to cope with the data deluge. This paper discusses various existing algorithms for compressing genomic data, as well as a few general-purpose algorithms, and proposes an LZW-based compression algorithm that uses indexed multiple dictionaries. The proposed method achieves an average compression ratio of 0.41 bits per base and an average compression time of 6.45 seconds for DNA sequences with an average size of 105.9 KB.
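For orientation, the sketch below shows textbook single-dictionary LZW applied to a DNA string, together with a bits-per-base calculation of the kind quoted in the abstract. It is not the authors' method: the abstract does not describe how the indexed multiple dictionaries work, so none of that is reproduced here. The function names, the four-letter `ACGT` alphabet, and the fixed-width code assumption in `bits_per_base` are all illustrative choices, not details taken from the paper.

```python
def lzw_compress(seq, alphabet="ACGT"):
    """Textbook LZW: return a list of integer codes for `seq`."""
    # Seed the dictionary with one entry per base (assumed alphabet).
    table = {ch: i for i, ch in enumerate(alphabet)}
    codes = []
    current = ""
    for ch in seq:
        if ch not in table:
            raise ValueError(f"symbol {ch!r} is not in the alphabet")
        candidate = current + ch
        if candidate in table:
            # Keep extending the match while it is a known phrase.
            current = candidate
        else:
            # Emit the longest known phrase and learn the new one.
            codes.append(table[current])
            table[candidate] = len(table)
            current = ch
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes, alphabet="ACGT"):
    """Inverse of lzw_compress, rebuilding the dictionary on the fly."""
    table = {i: ch for i, ch in enumerate(alphabet)}
    if not codes:
        return ""
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Special case: the code refers to the phrase being defined.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid code {code}")
        out.append(entry)
        table[len(table)] = previous + entry[0]
        previous = entry
    return "".join(out)


def bits_per_base(codes, n_bases):
    """Bits per base, assuming every code is stored at one fixed width."""
    width = max(1, max(codes).bit_length())
    return width * len(codes) / n_bases


if __name__ == "__main__":
    dna = "ACGTACGTACGTAAAA" * 8
    packed = lzw_compress(dna)
    assert lzw_decompress(packed) == dna
    print(len(packed), "codes,", bits_per_base(packed, len(dna)), "bits per base")
```

A plain 2-bit encoding of the four bases costs 2 bits per base, so a figure below that baseline means the compressor is exploiting repetition in the sequence. On very short inputs this sketch can come out above 2 bits per base because the dictionary has had no time to learn long phrases.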
