A Compressed Self-index Using a Ziv-Lempel Dictionary

Open Access

Author(s) - Luís M. S. Russo, Arlindo L. Oliveira

Publication year - 2006

Publication title - Lecture Notes in Computer Science

Language(s) - English

Resource type - Book series

SCImago Journal Rank - 0.249

H-Index - 400

eISSN - 1611-3349

pISSN - 0302-9743

DOI - 10.1007/11880561_14

Subject(s) - substring, compressed suffix array, suffix, computer science, suffix tree, algorithm, alphabet, entropy (arrow of time), data compression, data structure, combinatorics, mathematics, linguistics, philosophy, programming language, physics, quantum mechanics

A compressed full-text self-index for a text $T$, of size $u$, is a data structure used to search patterns $P$, of size $m$, in $T$ that requires reduced space, i.e. that depends on the empirical entropy ($H_k$, $H_0$) of $T$, and is, furthermore, able to reproduce any substring of $T$. In this paper we present a new compressed self-index able to locate the occurrences of $P$ in $O((m + occ) \log n)$ time, where $occ$ is the number of occurrences and $\sigma$ the size of the alphabet of $T$. The fundamental improvement over previous LZ78-based indexes is the reduction of the search time dependency on $m$ from $O(m^2)$ to $O(m)$. To achieve this result we point out the main obstacle to linear-time algorithms based on LZ78 data compression and expose and explore the nature of a recurrent structure in LZ-indexes, the $\mathcal{T}_{78}$ suffix tree. We show that our method is very competitive in practice by comparing it against the LZ-Index, the FM-index and a compressed suffix array.
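For readers unfamiliar with the Ziv-Lempel dictionary the abstract refers to, the following is a minimal sketch of a textbook LZ78 parse, not the index described in the paper. It splits a text into phrases, each formed by the longest previously seen phrase plus one new character, and stores the phrases in a trie. The function names `lz78_parse` and `lz78_decode` and the `(prefix_id, char)` phrase encoding are assumptions made for illustration only.

```python
def lz78_parse(text):
    """Split text into LZ78 phrases, returned as (prefix_id, char) pairs.

    prefix_id refers to an earlier phrase (0 is the empty phrase).
    Illustrative encoding only; not taken from the paper.
    """
    trie = {0: {}}      # node id -> {char: child node id}; node 0 is the root
    phrases = []
    node = 0            # current position in the trie
    for ch in text:
        child = trie[node].get(ch)
        if child is not None:
            node = child                 # keep extending a known phrase
        else:
            new_id = len(trie)           # ids are assigned in phrase order
            trie[node][ch] = new_id
            trie[new_id] = {}
            phrases.append((node, ch))   # new phrase = known phrase + ch
            node = 0                     # restart from the root
    if node != 0:
        # The text ended inside a phrase that is already in the dictionary.
        phrases.append((node, ""))
    return phrases


def lz78_decode(phrases):
    """Rebuild the original text from the (prefix_id, char) pairs."""
    dictionary = [""]                    # index 0 is the empty phrase
    for prefix_id, ch in phrases:
        dictionary.append(dictionary[prefix_id] + ch)
    return "".join(dictionary[1:])


if __name__ == "__main__":
    parsed = lz78_parse("abababab")
    print(parsed)
    print(lz78_decode(parsed))
```

On the input `"abababab"` the parse yields the phrases `a`, `b`, `ab`, `aba`, `b`. An occurrence of a pattern in the text may lie inside a single phrase or span several consecutive phrases, which is why indexes built on such a parse have to handle both cases.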
