Propose a Substitution Model for DNA Data Compression | Zendy

E. Ayad | Zendy; Ali Kamal | Zendy

AI Assistant Blog Pricing

Home ZAIA Blog

Open Access

Propose a Substitution Model for DNA Data Compression

Author(s) -

E. Ayad,

Ali Kamal

Publication year - 2018

Publication title -

international journal of computer applications

Language(s) - English

Resource type - Journals

ISSN - 0975-8887

DOI - 10.5120/ijca2018916484

Subject(s) - computer science , substitution (logic) , compression (physics) , programming language , thermodynamics , physics

One of the most difficult challenges in lossless data compression is finding the right model for the Deoxyribonucleic Acid (DNA) compression. DNA sequences include four chemical bases: adenine (A), guanine (G), cytosine (C), and thymine (T) where the information in DNA is stored as a code made up of these four chemical bases and these sequences show these are not random, if they are totally random then store them in two bits, this is the most efficient and logical way. This paper proposed an algorithm called A2 for DNA data compression. The proposed algorithm consists of four stages to build a substitutional model. The first stage used a modified version of run-length coding, in second and third stages mapping model for formatting data to be suitable for the final stage fed into Burrows-Wheeler Transform to use permutation technique that group related symbols as possible to improve dictionary coding using Lempel-Ziv (LZ77) and output file stored as (.a2) extension. The A2 algorithm implemented and tested on data from GenBank and shows acceptable file size and processing time ratio.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.

Having issues? You can contact us here

Accelerating Research