RPK-table based efficient algorithm for join-aggregate query on MapReduce



Open Access


Author(s) - Zhan Li, Qi Feng, Wei Chen, Tengjiao Wang

Publication year - 2016

Publication title - CAAI Transactions on Intelligence Technology

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.613

H-Index - 15

eISSN - 2468-6557

pISSN - 2468-2322

DOI - 10.1016/j.trit.2016.03.008

Subject(s) - computer science, join (topology), aggregate (composite), key (lock), overhead (engineering), table (database), query optimization, online aggregation, process (computing), data mining, database, materialized view, sargable, distributed computing, view, information retrieval, web search query, database design, search engine, operating system, mathematics, materials science, combinatorics, composite material

Join-aggregate is an important and widely used operation in database systems. However, processing join-aggregate queries is time-consuming in big data environments, especially on the MapReduce framework. There are two main bottlenecks: heavy I/O caused by temporary data, and high communication overhead between data nodes during query processing. To overcome these problems, we design a data structure called the Reference Primary Key table (RPK-table), which stores the primary-key/foreign-key relationships between tables. Based on this structure, we propose an improved algorithm for join-aggregate queries on the MapReduce framework. Experiments on the TPC-H dataset demonstrate that our algorithm outperforms existing methods in terms of communication cost and query response time.
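The abstract does not describe the RPK-table's layout or the algorithm's steps, so the following is only a toy sketch of the general idea under stated assumptions, not the authors' method. It assumes the RPK-table can be modelled as a lookup from a referenced primary key to the attribute the query groups by, and that each mapper uses it to resolve foreign keys locally and pre-aggregate before the shuffle. The table contents, the function names (`build_rpk_table`, `map_phase`, `reduce_phase`, `join_aggregate`) and the query (total order price per customer nation) are all hypothetical; only the column names loosely follow TPC-H.

```python
from collections import defaultdict

# Hypothetical toy tables; column names loosely follow TPC-H, values are made up.
customer = [  # (c_custkey [primary key], c_nationkey)
    (1, "FR"), (2, "FR"), (3, "JP"),
]
orders = [  # (o_orderkey [primary key], o_custkey [foreign key], o_totalprice)
    (10, 1, 100.0), (11, 1, 50.0), (12, 2, 25.0), (13, 3, 70.0), (14, 3, 30.0),
]


def build_rpk_table(customer_rows):
    """Assumed layout: referenced primary key -> group-by attribute."""
    return {custkey: nation for custkey, nation in customer_rows}


def map_phase(order_split, rpk_table):
    """Resolve each foreign key locally and pre-aggregate within one split."""
    partial = defaultdict(float)
    for _orderkey, custkey, price in order_split:
        nation = rpk_table.get(custkey)
        if nation is not None:  # skip rows whose foreign key has no match
            partial[nation] += price
    return list(partial.items())


def reduce_phase(mapper_outputs):
    """Sum the per-split partial aggregates for each group."""
    totals = defaultdict(float)
    for output in mapper_outputs:
        for nation, subtotal in output:
            totals[nation] += subtotal
    return dict(totals)


def join_aggregate(order_rows, customer_rows, num_splits=2):
    """Simulate the job in-process and count the records sent to reducers."""
    rpk_table = build_rpk_table(customer_rows)
    splits = [order_rows[i::num_splits] for i in range(num_splits)]
    mapper_outputs = [map_phase(split, rpk_table) for split in splits]
    shuffled_records = sum(len(output) for output in mapper_outputs)
    return reduce_phase(mapper_outputs), shuffled_records


result, shuffled = join_aggregate(orders, customer)
print(result, shuffled)
```

In this sketch the reducers receive one record per group per split, whereas a plain reduce-side join would shuffle every row of both tables. That is one plausible way a key-relationship table could reduce communication cost and temporary data; the paper itself should be consulted for the actual structure and algorithm.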
