Indel-tolerant read mapping with trinucleotide frequencies using cache-oblivious kd-trees
Author(s) - Md Pavel Mahmud, John Wiedenhoeft, Alexander Schliep
Publication year - 2012
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/bts380
Subject(s) - computer science , edit distance , indel , embedding , cache , homomorphic encryption , string (physics) , theoretical computer science , algorithm , parallel computing , encryption , artificial intelligence , mathematics , genetics , biology , genotype , single nucleotide polymorphism , mathematical physics , gene , operating system
Mapping billions of reads from next-generation sequencing experiments to reference genomes is a crucial task that can require hundreds of hours of running time on a single CPU, even for the fastest known implementations. Traditional approaches have difficulty with matches at large edit distance, particularly in the presence of frequent or large insertions and deletions (indels). This is a serious obstacle both in determining the spectrum and abundance of genetic variations and in personal genomics.