Sapling: accelerating suffix array queries with learned data models
Author(s) - Melanie Kirsche, Arun Das, Michael C. Schatz
Publication year - 2020
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btaa911
Subject(s) - computer science , suffix , suffix array , memory footprint , source code , data structure , binary number , compressed suffix array , binary code , speedup , suffix tree , data mining , theoretical computer science , parallel computing , programming language , philosophy , linguistics , arithmetic , mathematics
As genomic data becomes more abundant, efficient algorithms and data structures for sequence alignment become increasingly important. The suffix array is a widely used data structure for accelerating alignment, but the binary search algorithm used to query it requires widespread memory accesses, causing a large number of cache misses on large datasets.
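To make the memory-access pattern concrete, the following is a minimal sketch of a plain suffix array query by binary search. It is not the paper's implementation: the function names `build_suffix_array` and `find_occurrence` are hypothetical, and the naive construction is only suitable for short illustrative strings.

```python
from typing import List


def build_suffix_array(text: str) -> List[int]:
    """Return the start positions of all suffixes of `text` in sorted order.

    Naive construction (sorts full suffix strings); illustrative only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrence(text: str, sa: List[int], query: str) -> int:
    """Return one position where `query` occurs in `text`, or -1 if absent.

    Standard lower-bound binary search over the suffix array. Each probe
    reads `text` at sa[mid], which can be far from the previous probe;
    on a genome-sized text these scattered reads are what cause the
    cache misses described above.
    """
    m = len(query)
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        # Compare the query against the first m characters of the suffix.
        if text[sa[mid]:sa[mid] + m] < query:
            lo = mid + 1
        else:
            hi = mid
    if lo < len(sa) and text[sa[lo]:sa[lo] + m] == query:
        return sa[lo]
    return -1


if __name__ == "__main__":
    genome = "banana"
    sa = build_suffix_array(genome)
    print(sa)
    print(find_occurrence(genome, sa, "ana"))
```

The search performs roughly log2(n) probes for a text of length n, and early probes jump across large parts of both the suffix array and the text, so little of the data touched by one probe is reused by the next.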