z-logo
open-access-imgOpen Access
Inducing Suffix and LCP Arrays in External Memory
Author(s) -
Timo Bingmann,
Johannes Fischer,
Vitaly Osipov
Publication year - 2016
Publication title -
acm journal of experimental algorithmics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.494
H-Index - 37
ISSN - 1084-6654
DOI - 10.1145/2975593
Subject(s) - suffix array , suffix , generalized suffix tree , computer science , compressed suffix array , suffix tree , algorithm , sorting , auxiliary memory , construct (python library) , prefix , time complexity , parallel computing , overhead (engineering) , data structure , philosophy , linguistics , operating system , programming language , computer hardware
We consider full text index construction in external memory (EM). Our first contribution is an inducing algorithm for suffix arrays in external memory, which runs in sorting complexity. Practical tests show that this algorithm outperforms the previous best EM suffix sorter [Dementiev et al., JEA 2008] by a factor of about two in time and I/O volume. Our second contribution is to augment the first algorithm to also construct the array of longest common prefixes (LCPs). This yields a new internal memory LCP array construction algorithm and the first EM construction algorithm for LCP arrays. The overhead in time and I/O volume for this extended algorithm over plain suffix array construction is roughly two. Our algorithms scale far beyond problem sizes previously considered in the literature (text size of 80GiB using only 4GiB of RAM in our experiments).

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here
Accelerating Research

Address

John Eccles House
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom