Generate‐map‐reduce: An extension to map‐reduce to support shared data and recursive computations | Zendy

Dharanipragada Janakiram | Zendy; Iyer Geeta | Zendy; Kailasam Sriram | Zendy

AI Assistant Blog Pricing

Home ZAIA Blog

Premium

Generate‐map‐reduce: An extension to map‐reduce to support shared data and recursive computations

Author(s) -

Dharanipragada Janakiram,

Iyer Geeta,

Kailasam Sriram

Publication year - 2014

Publication title -

concurrency and computation: practice and experience

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.309

H-Index - 67

eISSN - 1532-0634

pISSN - 1532-0626

DOI - 10.1002/cpe.3018

Subject(s) - computer science , speedup , parallel computing , computation , cluster analysis , distributed memory , data parallelism , theoretical computer science , shared memory , distributed computing , parallelism (grammar) , algorithm , machine learning

SUMMARY It is difficult to express the parallelism present in complex computations by using existing higher level abstractions such as MapReduce and Dryad. These computations include applications from wide variety of domains, like Artificial Intelligence, Decision Tree Algorithms, Association Rule Mining, Recommender Systems, Graph Algorithms, Clustering Algorithms, Compute Intensive Scientific Workflows, Optimization Algorithms, and so forth. Their execution graphs introduce new challenges in terms of programmer expressibility and runtime performance such as iterative and recursive computations, shared communication model, and so forth. We propose an extension to MapReduce, called Generate‐Map‐Reduce (GMR), targeted towards modeling these applications. GMR introduces a new Generate abstraction into the MapReduce framework that captures recursive computations. The runtime also supports iterative jobs and a distributed communication model by using shared data structures. We illustrate recursive computations with GMR by modeling complex applications such as simulated annealing, A* search, and adaptive quadrature computation that require recursive spawning of new tasks to handle variable degree of parallelism. GMR runtime supports caching of common data across iterations in memory and local disks. We illustrate how this caching helps in achieving significant speedup for iterative computations by modeling k ‐means clustering. Copyright © 2013 John Wiley & Sons, Ltd.

This content is not available in your region!

Continue researching here.

Having issues? You can contact us here

Accelerating Research