CalcGen Sequence Assembler Using a Spatio-temporally Efficient DNA Sequence Search Algorithm
Author(s) - Kyong Oh Yoon, SungBae Cho
Publication year - 2013
Publication title - Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2013.10.016
Subject(s) - computer science , de bruijn sequence , algorithm , sequence (biology) , k mer , hash function , hybrid genome assembly , sequence assembly , integer (computer science) , dna sequencing , theoretical computer science , reference genome , dna , mathematics , biochemistry , gene expression , chemistry , computer security , transcriptome , discrete mathematics , biology , gene , genetics , programming language
The advent of ultra-high-throughput sequencing technology has produced an enormous amount of bio-sequence information, and current advances in the bio-industry are bringing forward an era of personalized medicine based on individual genome information. However, analyzing a massive number of bio-sequences requires large storage, so the analysis sometimes needs a supercomputer and novel software that can handle such a volume of sequence information. For this type of analysis, several sequence-matching algorithms have been devised for alignment and assembly, which are fundamental to analyzing bio-sequences. Those algorithms regard nucleotide sequences as strings and compare characters one by one during analysis. They use hash index tables, de Bruijn graphs, the Burrows-Wheeler transform, and so on. In this paper, for time- and space-efficient DNA searching, we propose a simple algorithm that transforms a base sequence into a k-mer integer array and then analyzes the transformed integer array with a unit search operator and a non-unit search operator, resulting in a storage space reduction of about 0.28 fold. Furthermore, based on the proposed algorithm, we have developed a sequence analysis program called CalcGen assembler, and we show the usefulness of the program with several experiments.
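To illustrate the idea of turning a base sequence into a k-mer integer array, the following is a minimal sketch and not the CalcGen implementation. It assumes a 2-bit code per base (A=0, C=1, G=2, T=3), which the abstract does not specify, and the names `BASE_CODE`, `kmer_integers`, and `find_kmer` are hypothetical. The search shown is a plain integer comparison; it does not attempt to reproduce the paper's unit and non-unit search operators, which the abstract does not define.

```python
# Assumed 2-bit code per base; the paper's actual encoding may differ.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_integers(seq, k):
    """Return the integer value of every overlapping k-mer in seq."""
    if k <= 0 or len(seq) < k:
        return []
    mask = (1 << (2 * k)) - 1  # keeps only the lowest 2*k bits
    value = 0
    out = []
    for i, base in enumerate(seq.upper()):
        # Shift in the new base and drop the base that left the window.
        value = ((value << 2) | BASE_CODE[base]) & mask
        if i >= k - 1:
            out.append(value)
    return out


def find_kmer(seq, query):
    """Return start positions of query in seq by comparing integers."""
    k = len(query)
    encoded = kmer_integers(query, k)
    if not encoded:
        return []
    target = encoded[0]
    return [i for i, v in enumerate(kmer_integers(seq, k)) if v == target]
```

For example, `kmer_integers("ACGT", 2)` encodes the k-mers AC, CG, and GT, and `find_kmer("ACGTACGT", "CGT")` reports every start position of CGT. Comparing one integer per window replaces the character-by-character comparison that string-based methods perform.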
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom