Maximum Entropy Summary Trees
Author(s) - Howard Karloff, Kenneth E. Shirley
Publication year - 2013
Publication title - Computer Graphics Forum
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.578
H-Index - 120
eISSN - 1467-8659
pISSN - 0167-7055
DOI - 10.1111/cgf.12094
Subject(s) - k ary tree , interval tree , tree (set theory) , computer science , entropy (arrow of time) , computation , search tree , algorithm , mathematics , theoretical computer science , tree structure , combinatorics , binary tree , search algorithm , physics , quantum mechanics
Given a very large, node‐weighted, rooted tree on, say, n nodes, if one has only enough space to display a k‐node summary of the tree, what is the most informative way to draw the tree? We define a type of weighted tree that we call a summary tree of the original tree that results from aggregating nodes of the original tree subject to certain constraints. We suggest that the best choice of which summary tree to use (among those with a fixed number of nodes) is the one that maximizes the information‐theoretic entropy of a natural probability distribution associated with the summary tree, and we provide a (pseudopolynomial‐time) dynamic‐programming algorithm to compute this maximum entropy summary tree, when the weights are integral. The result is an automated way to summarize large trees and retain as much information about them as possible, while using (and displaying) only a fraction of the original node set. We illustrate the computation and use of maximum entropy summary trees on five real data sets whose weighted tree representations vary widely in structure. We also provide an additive approximation algorithm and a greedy heuristic that are faster than the optimal algorithm, and generalize to trees with real‐valued weights.
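To make the selection criterion concrete, the following is a minimal sketch of how the entropy of a candidate summary tree might be scored. It assumes the "natural probability distribution" assigns each summary node its weight divided by the total weight; the abstract does not spell this out, so treat it as an assumption. The function name `summary_entropy` and the two example weight lists are hypothetical and do not come from the paper's data sets. The sketch does not implement the dynamic-programming algorithm, the approximation algorithm, or the greedy heuristic; it only evaluates the objective for given node weights.

```python
import math


def summary_entropy(weights):
    """Shannon entropy (in bits) of the distribution p_i = w_i / sum(w).

    `weights` holds the non-negative weights of the nodes of one candidate
    summary tree. Assumption: this normalization is the distribution the
    abstract refers to.
    """
    total = sum(weights)
    if total <= 0:
        raise ValueError("total weight must be positive")
    # Zero-weight nodes contribute nothing to the entropy, so skip them.
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


# Two hypothetical 3-node summaries of the same tree (total weight 12).
balanced = [4, 4, 4]
skewed = [10, 1, 1]

# Under this criterion the summary with the higher entropy is preferred.
best = max([balanced, skewed], key=summary_entropy)
```

Under this scoring, a summary whose nodes carry comparable weight is preferred over one dominated by a single heavy node, which matches the stated goal of retaining as much information as possible in k nodes.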