Open Access

Using Performance Measurements to Improve MapReduce Algorithms

Author(s) - Todd Plantenga, Yung Ryn Choe, Ann S. Yoshimura

Publication year - 2012

Publication title - Procedia Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.334

H-Index - 76

ISSN - 1877-0509

DOI - 10.1016/j.procs.2012.04.210

Subject(s) - computer science, sophistication, focus (optics), visualization, data mining, software, big data, algorithm, database, machine learning, operating system, social science, physics, sociology, optics

The Hadoop MapReduce software environment is used for parallel processing of distributed data. Data mining algorithms of increasing sophistication are being implemented in MapReduce, bringing new challenges for performance measurement and tuning. We focus on analyzing a job after it completes, using information collected from Hadoop logs and machine metrics. Our analysis, inspired by [1] [2], goes beyond conventional Hadoop JobTracker analysis by integrating more data and providing visualization tools that run in a web browser. This paper describes examples where measurements helped diagnose subtle issues and improve algorithm performance. The examples demonstrate the value of correlating detailed information that is not usually examined in standard Hadoop performance displays.
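The kind of correlation the abstract describes can be illustrated with a minimal sketch. It is not the authors' tool: the record types `TaskAttempt` and `MetricSample`, their fields, and the function `mean_cpu_per_task` are hypothetical, and the sketch assumes task intervals and machine metric samples have already been parsed from logs into memory. It joins each task to the CPU samples taken on the same host during the task's run.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class TaskAttempt:
    """Hypothetical record for one task attempt, as parsed from job logs."""
    task_id: str
    host: str
    start: float  # seconds since job start
    end: float    # seconds since job start


@dataclass(frozen=True)
class MetricSample:
    """Hypothetical record for one machine metric sample."""
    host: str
    timestamp: float  # seconds since job start
    cpu_pct: float


def mean_cpu_per_task(tasks, samples):
    """Return {task_id: mean CPU % on the task's host while it ran}.

    A task with no samples inside its interval maps to None.
    """
    # Group samples by host so each task scans only its own machine.
    by_host = {}
    for s in samples:
        by_host.setdefault(s.host, []).append(s)

    report = {}
    for t in tasks:
        window = [
            s.cpu_pct
            for s in by_host.get(t.host, [])
            if t.start <= s.timestamp <= t.end
        ]
        report[t.task_id] = mean(window) if window else None
    return report


# Made-up illustrative data, not measurements from the paper.
tasks = [
    TaskAttempt("map_0", "node1", 0.0, 10.0),
    TaskAttempt("map_1", "node2", 0.0, 30.0),
]
samples = [
    MetricSample("node1", 5.0, 90.0),
    MetricSample("node1", 15.0, 10.0),  # outside map_0's interval
    MetricSample("node2", 5.0, 20.0),
    MetricSample("node2", 25.0, 40.0),
]
report = mean_cpu_per_task(tasks, samples)
```

In this made-up data, `map_1` runs three times as long as `map_0` while its host shows low CPU use, which is the sort of pattern that a per-task timing display alone would not explain.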
