Bonsai: A compact representation of trees
Author(s) - Darragh, John J.; Cleary, John G.; Witten, Ian H.
Publication year - 1993
Publication title - Software: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/spe.4380230305
Subject(s) - trie, computer science, hash function, ternary search tree, node (physics), representation (politics), theoretical computer science, tree (set theory), weight balanced tree, hash table, data structure, algorithm, perfect hash function, set (abstract data type), search tree, rank (graph theory), discrete mathematics, binary search tree, binary tree, tree structure, mathematics, combinatorics, search algorithm, programming language, interval tree, structural engineering, politics, law, political science, engineering
This paper shows how trees can be stored in a very compact form, called ‘Bonsai’, using hash tables. A method is described that is suitable for large trees that grow monotonically within a predefined maximum size limit. Using it, pointers in any tree can be represented within 6 + ⌈log₂ n⌉ bits per node, where n is the maximum number of children a node can have. We first describe a general way of storing trees in hash tables, and then introduce the idea of compact hashing which underlies the Bonsai structure. These two techniques are combined to give a compact representation of trees, and a practical methodology is set out to permit the design of these structures. The new representation is compared with two conventional tree implementations in terms of the storage required per node. Examples of programs that must store large trees within a strict maximum size include those that operate on trie structures derived from natural language text. We describe how the Bonsai technique has been applied to the trees that arise in text compression and adaptive prediction, and include a discussion of the design parameters that work well in practice.
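As a rough illustration of the first idea in the abstract (storing a tree in a hash table), the Python sketch below keeps every node in one slot of a fixed-size open-addressing table and finds a child by hashing the pair (parent slot, child label). It is not the paper's Bonsai structure: each slot here holds the full key, whereas compact hashing is what lets Bonsai keep far less per node. The class name `HashedTree`, the helpers `insert_word` and `bonsai_bits_per_node`, the multiplicative hash constant, and the use of linear probing are all assumptions made for this example. The last helper only evaluates the per-node bound quoted in the abstract.

```python
class HashedTree:
    """Tree stored in an open-addressing hash table (illustrative only)."""

    def __init__(self, capacity, max_children):
        self.capacity = capacity          # predefined maximum size of the tree
        self.max_children = max_children  # n: maximum children per node
        self.slots = [None] * capacity    # each used slot holds (parent, label)
        self.root = 0
        self.slots[self.root] = (-1, 0)   # sentinel key marking the root
        self.size = 1

    def _probe(self, parent, label):
        # Linear probing from the home address of the key (parent, label).
        # Returns the slot holding the key, the first empty slot on its probe
        # path, or None if the table is full and the key is absent.
        key = (parent, label)
        home = ((parent * self.max_children + label + 1) * 2654435761) % self.capacity
        for step in range(self.capacity):
            slot = (home + step) % self.capacity
            if self.slots[slot] is None or self.slots[slot] == key:
                return slot
        return None

    def find_child(self, parent, label):
        slot = self._probe(parent, label)
        if slot is None or self.slots[slot] is None:
            return None
        return slot

    def add_child(self, parent, label):
        # Nodes are never deleted (the tree only grows), so plain linear
        # probing needs no tombstones.
        slot = self._probe(parent, label)
        if slot is None:
            raise MemoryError("tree has reached its maximum size")
        if self.slots[slot] is None:
            self.slots[slot] = (parent, label)
            self.size += 1
        return slot


def insert_word(tree, word):
    # Treat the table as a trie: one node per byte of the word.
    node = tree.root
    for byte in word.encode("ascii"):
        node = tree.add_child(node, byte)
    return node


def bonsai_bits_per_node(max_children):
    # The bound quoted in the abstract: 6 + ceil(log2 n) bits per node.
    return 6 + (max_children - 1).bit_length()
```

For a byte-labelled trie (n = 256), `bonsai_bits_per_node(256)` evaluates to 14. A node in this sketch is identified by its slot index, so no child or sibling pointers are stored; the table position itself plays that role.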