Sampled suffix array with minimizers
Author(s) -
Grabowski Szymon,
Raniszewski Marcin
Publication year - 2017
Publication title -
Software: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/spe.2481
Subject(s) - suffix , suffix array , sampling (signal processing) , benchmark (surveying) , alphabet , computer science , character (mathematics) , suffix tree , algorithm , scheme (mathematics) , sampling scheme , space (punctuation) , mathematics , statistics , geography , linguistics , mathematical analysis , philosophy , geometry , computer vision , operating system , geodesy , filter (signal processing) , estimator
Summary - Evenly sampling the suffixes of the suffix array is an old idea that trades pattern search time for reduced index space. A few years ago, Claude et al. presented an alphabet sampling scheme that allows more efficient pattern searches than the sparse suffix array for sufficiently long patterns. A drawback of their approach is that every sought pattern must contain at least one character from the chosen subalphabet. In this work, we propose an alternative suffix sampling approach whose only requirement is a minimum pattern length, which is more convenient in practice. Experiments show that our algorithm (in a few variants) achieves competitive time‐space tradeoffs on most standard benchmark data. Copyright © 2017 John Wiley & Sons, Ltd.
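As an illustration of why a minimum pattern length can be the only requirement, here is a minimal Python sketch of suffix sampling driven by minimizers. It is not the authors' implementation: the window size `w`, the q-gram length `q`, the leftmost tie-breaking rule, and all function names are assumptions made for this example, and the search builds its comparison keys naively for clarity rather than speed. The idea shown is that any pattern of length at least `w + q - 1` contains a full window, so its minimizer position is guaranteed to be sampled at every occurrence in the text.

```python
from bisect import bisect_left, bisect_right


def minimizer_pos(s, start, w, q):
    """Start of the smallest q-gram among the w q-grams beginning at
    start .. start + w - 1 (ties are broken by the leftmost position)."""
    return min(range(start, start + w), key=lambda i: (s[i:i + q], i))


def build_index(text, w, q):
    """Sampled suffix array: only suffixes starting at a minimizer of some
    text window are kept, sorted lexicographically."""
    window_len = w + q - 1
    sampled = {minimizer_pos(text, i, w, q)
               for i in range(len(text) - window_len + 1)}
    return sorted(sampled, key=lambda i: text[i:])


def search(text, ssa, pattern, w, q):
    """Return all start positions of pattern in text, in increasing order.
    The pattern must span at least one full window."""
    if len(pattern) < w + q - 1:
        raise ValueError("pattern shorter than the minimum length w + q - 1")
    # The minimizer of the pattern's first window sits at offset d.
    d = minimizer_pos(pattern, 0, w, q)
    tail = pattern[d:]
    # Truncated suffixes stay in sorted order, so binary search applies.
    # (Building the key list is linear; a real index would compare lazily.)
    keys = [text[i:i + len(tail)] for i in ssa]
    lo, hi = bisect_left(keys, tail), bisect_right(keys, tail)
    hits = []
    for i in ssa[lo:hi]:
        start = i - d
        # Verify the d characters that precede the sampled suffix.
        if start >= 0 and text[start:i] == pattern[:d]:
            hits.append(start)
    return sorted(hits)


if __name__ == "__main__":
    text = "abracadabra_abracadabra"
    w, q = 3, 2
    ssa = build_index(text, w, q)
    print(len(ssa), "of", len(text), "suffixes sampled")
    print(search(text, ssa, "cadab", w, q))
```

In this sketch the space saving comes from storing only the minimizer positions, and the price is the verification step on the pattern prefix that precedes the minimizer.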