Application of Filters to Multiway Joins in MapReduce

Taewhi Lee; Dong-Hyuk Im; Hangkyu Kim; Hyoung-Joo Kim

Open Access

Author(s) - Taewhi Lee, Dong-Hyuk Im, Hangkyu Kim, Hyoung-Joo Kim

Publication year - 2014

Publication title - Mathematical Problems in Engineering

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.262

H-Index - 62

eISSN - 1026-7077

pISSN - 1024-123X

DOI - 10.1155/2014/249418

Subject(s) - joins, computer science, join (topology), set (abstract data type), database, data mining, parallel computing, programming language, mathematics, combinatorics

Joining multiple datasets in MapReduce may amplify the disk and network overheads because intermediate join results have to be written to the underlying distributed file system, or map output records have to be replicated multiple times. This paper proposes a method for applying filters based on the processing order of input datasets, which is appropriate for the two types of multiway joins: common attribute joins and distinct attribute joins. The number of redundant records filtered depends on the processing order. In common attribute joins, the input records do not need to be replicated, so a set of filters is created, which are applied in turn. In distinct attribute joins, the input records have to be replicated, so multiple sets of filters need to be created, which depend on the number of join attributes. The experimental results showed that our approach outperformed a cascade of two-way joins and basic multiway joins in cases where small portions of input datasets were joined.
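To illustrate why the number of filtered records depends on the processing order in a common attribute join, here is a minimal single-process sketch. It is not the authors' implementation: the function name `filter_in_order`, the toy datasets `R`, `S`, and `T`, and the use of exact Python sets as filters are all assumptions made for illustration. A distributed job would more plausibly use a compact probabilistic filter, which could also let some non-joining records through.

```python
from typing import Hashable, List, Optional, Set, Tuple

Record = Tuple[Hashable, str]  # (join key, payload); hypothetical record shape


def filter_in_order(datasets: List[List[Record]]) -> Tuple[List[List[Record]], List[int]]:
    """Pass datasets through a chain of key filters in the given order.

    The first dataset is read unfiltered. Each later dataset keeps only the
    records whose join key survived every dataset processed before it.
    """
    survivors: List[List[Record]] = []
    kept_counts: List[int] = []
    allowed: Optional[Set[Hashable]] = None  # None means no filter exists yet
    for records in datasets:
        kept = [r for r in records if allowed is None or r[0] in allowed]
        survivors.append(kept)
        kept_counts.append(len(kept))
        # The next filter contains only keys that are still join candidates.
        allowed = {key for key, _ in kept}
    return survivors, kept_counts


# Toy inputs that all join on the same attribute (the first tuple field).
R = [(1, "r1"), (2, "r2"), (3, "r3"), (4, "r4")]
S = [(2, "s2"), (3, "s3"), (5, "s5")]
T = [(3, "t3"), (6, "t6")]

_, largest_first = filter_in_order([R, S, T])
_, smallest_first = filter_in_order([T, S, R])

print(largest_first, sum(largest_first))    # [4, 2, 1] 7
print(smallest_first, sum(smallest_first))  # [2, 1, 1] 4
```

In this toy case, processing the smallest dataset first lets 4 records through instead of 7, although only key 3 joins across all three inputs in either order. This sketch covers only the common attribute case; the distinct attribute case, where records are replicated and several sets of filters are needed, is not modeled here.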
