z-logo
open-access-imgOpen Access
Data structure set-trie for storing and querying sets: Theoretical and empirical analysis
Author(s) -
Iztok Savnik,
Mikita Akulich,
Matjaž Krnc,
Riste Škrekovski
Publication year - 2021
Publication title -
plos one
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0245122
Subject(s) - trie , computer science , cardinality (data modeling) , set (abstract data type) , data mining , data set , theoretical computer science , data structure , probabilistic logic , algorithm , artificial intelligence , programming language
Set containment operations form an important tool in various fields such as information retrieval, AI systems, object-relational databases, and Internet applications. In the paper, a set-trie data structure for storing sets is considered, along with the efficient algorithms for the corresponding set containment operations. We present the mathematical and empirical study of the set-trie. In the mathematical study, the relevant upper-bounds on the efficiency of its expected performance are established by utilizing a natural probabilistic model. In the empirical study, we give insight into how different distributions of input data impact the efficiency of set-trie. Using the correct parameters for those randomly generated datasets, we expose the key sources of the input sensitivity of set-trie. Finally, the empirical comparison of set-trie with the inverted index is based on the real-world datasets containing sets of low cardinality. The comparison shows that the running time of set-trie consistently outperforms the inverted index by orders of magnitude.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here