An Improvement of Choosing Map-join Candidates in Hive
Author(s) - Fang Wang, Yong Shi
Publication year - 2012
Publication title - Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2012.04.225
Subject(s) - computer science, join (topology), data mining, information retrieval, artificial intelligence, database, mathematics, combinatorics
Currently, Hive uses the on-disk size of tables to determine whether a common-join should be converted into a map-join. Our experiments demonstrate that this is a conservative decision criterion that fails to identify optimization opportunities close to the decision frontier. Our implementation differs from the current one in the way it identifies optimization candidates. By pre-computing hashtable sizes and adjusting them with per-query selectivity factors, we are able to choose optimization candidates directly as a function of expected hashtable size, factoring out the conservative file size criterion. In this paper we show that this approach results in more common-join to map-join optimizations and can provide average speedups of up to 1.30 in certain scenarios.
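
The difference between the two criteria described above can be illustrated with a minimal Python sketch. It is not the authors' implementation or Hive's code: the names `TableStats`, `by_file_size`, `expected_hashtable_bytes`, and `by_hashtable_size` are hypothetical, the byte figures are invented for illustration, and scaling the pre-computed hashtable size linearly by the selectivity factor is an assumption about how the adjustment might work.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """Hypothetical per-table statistics (names are illustrative only)."""
    file_size_bytes: int        # size of the table on disk
    full_hashtable_bytes: int   # pre-computed hashtable size for the whole table


def by_file_size(stats: TableStats, file_size_limit: int) -> bool:
    """File-size criterion: convert only if the on-disk table is small enough."""
    return stats.file_size_bytes <= file_size_limit


def expected_hashtable_bytes(stats: TableStats, selectivity: float) -> float:
    """Scale the pre-computed hashtable size by a per-query selectivity factor.

    Assumption: selectivity is the fraction of rows that survive the query's
    filters, and hashtable size shrinks linearly with it.
    """
    if not 0.0 <= selectivity <= 1.0:
        raise ValueError("selectivity must be in [0, 1]")
    return stats.full_hashtable_bytes * selectivity


def by_hashtable_size(stats: TableStats, selectivity: float,
                      memory_budget: int) -> bool:
    """Hashtable criterion: convert if the expected hashtable fits in memory."""
    return expected_hashtable_bytes(stats, selectivity) <= memory_budget


# Illustrative numbers only: a table slightly too large for the file-size
# limit, but whose filtered hashtable fits comfortably in the memory budget.
small_table = TableStats(file_size_bytes=40_000_000,
                         full_hashtable_bytes=120_000_000)
FILE_SIZE_LIMIT = 25_000_000   # assumed threshold for the sketch
MEMORY_BUDGET = 64_000_000     # assumed hashtable memory budget

file_size_decision = by_file_size(small_table, FILE_SIZE_LIMIT)
hashtable_decision = by_hashtable_size(small_table, 0.1, MEMORY_BUDGET)
```

With these made-up figures the file-size check rejects the table while the hashtable check accepts it, which is the kind of near-frontier case the abstract says the file-size criterion misses.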
