
Bit Reduction based Compression Algorithm for DNA Sequences
Author(s) -
Rosario Gilmary,
Gowtham Krishnan Murugesan
Publication year - 2021
Publication title -
international journal of scientific research in science, engineering and technology
Language(s) - English
Resource type - Journals
eISSN - 2395-1990
pISSN - 2394-4099
DOI - 10.32628/ijsrset218529
Subject(s) - data compression , compression ratio , computer science , compression (physics) , sequence (biology) , lossless compression , algorithm , dna sequencing , reduction (mathematics) , encoding (memory) , genbank , dna , biology , mathematics , genetics , gene , artificial intelligence , chemistry , materials science , geometry , organic chemistry , composite material , combustion
Deoxyribonucleic acid called DNA is the smallest fundamental unit that bears the genetic instructions of a living organism. It is used in the up growth and functioning of all known living organisms. Current DNA sequencing equipment creates extensive heaps of genomic data. The Nucleotide databases like GenBank, size getting 2 to 3 times larger annually. The increase in genomic data outstrips the increase in storage capacity. Massive amount of genomic data needs an effectual depository, quick transposal and preferable performance. To reduce storage of abundant data and data storage expense, compression algorithms were used. Typical compression approaches lose status while compressing these sequences. However, novel compression algorithms have been introduced for better compression ratio. The performance is correlated in terms of compression ratio; ratio of the capacity of compressed file and compression/decompression time; time taken to compress/decompress the sequence. In the proposed work, the input DNA sequence is compressed by reconstructing the sequence into varied formats. Here the input DNA sequence is subjected to bit reduction. The binary output is converted to hexadecimal format followed by encoding. Thus, the compression ratio of the biological sequence is improved.