Compact inverted index storage using general‐purpose compression libraries

Author(s) - Matthias Petri, Alistair Moffat

Publication year - 2018

Publication title - Software: Practice and Experience

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.437

H-Index - 70

eISSN - 1097-024X

pISSN - 0038-0644

DOI - 10.1002/spe.2556

Subject(s) - computer science, implementation, inverted index, decoding methods, compression (physics), index (typography), data compression, information retrieval, key (lock), task (project management), keyword search, data mining, algorithm, world wide web, operating system, search engine indexing, software engineering, materials science, management, economics, composite material

Summary - Efficient storage of large inverted indexes is one of the key technologies that support current web search services. Here we re‐examine mechanisms for representing document‐level inverted indexes and within‐document term frequencies, including comparing specialized methods developed for this task against recent fast implementations of general‐purpose adaptive compression techniques. Experiments with the Gov2‐URL collection and a large collection of crawled news stories show that standard compression libraries can provide compression effectiveness as good as or better than previous methods, with decoding rates only moderately slower than reference implementations of those tailored approaches. This surprising outcome means that high‐performance index compression can be achieved without requiring the use of specialized implementations.
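
To make the comparison in the summary concrete, the following is a minimal sketch, not the authors' implementation. It stores one postings list two ways: with a variable-byte code, used here as a stand-in for a "specialized" method, and with Python's standard `zlib` module, used here as a stand-in for a general-purpose compression library. The summary does not name the codes or libraries that were tested, so both choices, the function names, and the sample postings list are assumptions made for illustration.

```python
import struct
import zlib


def to_gaps(doc_ids):
    """Turn a strictly increasing list of document ids into d-gaps."""
    gaps = []
    prev = 0
    for d in doc_ids:
        gaps.append(d - prev)
        prev = d
    return gaps


def from_gaps(gaps):
    """Invert to_gaps by taking a running sum."""
    doc_ids = []
    total = 0
    for g in gaps:
        total += g
        doc_ids.append(total)
    return doc_ids


def vbyte_encode(values):
    """Variable-byte code: 7 payload bits per byte, low-order bits first.
    The high bit is set on the last byte of each value."""
    out = bytearray()
    for v in values:
        while v >= 128:
            out.append(v & 0x7F)
            v >>= 7
        out.append(v | 0x80)
    return bytes(out)


def vbyte_decode(data):
    """Decode a byte string produced by vbyte_encode."""
    values = []
    cur = 0
    shift = 0
    for b in data:
        if b & 0x80:  # terminator byte: finish the current value
            values.append(cur | ((b & 0x7F) << shift))
            cur = 0
            shift = 0
        else:
            cur |= b << shift
            shift += 7
    return values


def library_encode(values):
    """Pack gaps as little-endian 32-bit integers, then hand the bytes to zlib.
    Assumes every gap fits in 32 bits."""
    raw = struct.pack(f"<{len(values)}I", *values)
    return zlib.compress(raw, 9)


def library_decode(data):
    """Decode a byte string produced by library_encode."""
    raw = zlib.decompress(data)
    return list(struct.unpack(f"<{len(raw) // 4}I", raw))


# Hypothetical postings list, for illustration only.
postings = [3, 7, 8, 15, 1000, 1001, 70000]
gaps = to_gaps(postings)
vbyte_bytes = vbyte_encode(gaps)
library_bytes = library_encode(gaps)
print(len(vbyte_bytes), len(library_bytes))
```

Both paths round-trip to the original document ids via `from_gaps(vbyte_decode(...))` and `from_gaps(library_decode(...))`. On a list this short the library's fixed overhead dominates, so the byte counts printed here say nothing about the results reported in the summary; a meaningful comparison needs long postings lists and timed decoding.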
