Workload-Optimal Histograms on Streams
Author(s) -
S. Muthukrishnan,
Michael Strauss,
Xue Zheng
Publication year - 2005
Publication title -
lecture notes in computer science
Language(s) - English
Resource type - Book series
SCImago Journal Rank - 0.249
H-Index - 400
eISSN - 1611-3349
pISSN - 0302-9743
ISBN - 3-540-29118-0
DOI - 10.1007/11561071_65
Subject(s) - computer science , histogram , partition (number theory) , query optimization , query plan , data mining , workload , streaming data , piecewise , information retrieval , sargable , artificial intelligence , web search query , search engine , mathematical analysis , mathematics , combinatorics , image (mathematics) , operating system
A histogram is a piecewise-constant approximation of an observed data distribution. A histogram is used as a small-space, approximate synopsis of the underlying data distribution, which is often too large to be stored precisely. Histograms have found many applications in database management systems, perhaps most commonly for query selectivity estimation in query optimizers [1], but have also found applications in approximate query answering [2], load balancing in parallel join execution [3], mining time-series data [4], partition-based temporal join execution, query pro.ling for user feedback, etc. Ioannidis has a nice overview of the history of histograms, their applications, and their use in commercial DBMSs [5]. Also, Poosala’s thesis provides a systematic treatment of different types of histograms [3].
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom