Optimizing Hadoop parameter settings with gene expression programming guided PSO


Author(s) - Khan Mukhtaj, Huang Zhengwen, Li Maozhen, Taylor Gareth A., Khan Mushtaq

Publication year - 2016

Publication title - Concurrency and Computation: Practice and Experience

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.309

H-Index - 67

eISSN - 1532-0634

pISSN - 1532-0626

DOI - 10.1002/cpe.3786

Subject(s) - computer science, particle swarm optimization, big data, task (project management), programming paradigm, function (biology), process (computing), ant colony optimization algorithms, data mining, artificial intelligence, algorithm, operating system, engineering, biology, programming language, systems engineering, evolutionary biology

Summary - Hadoop MapReduce has become a major computing technology in support of big data analytics. The Hadoop framework has over 190 configuration parameters, some of which can significantly affect the performance of a Hadoop job. Manually tuning these parameters to optimal or near-optimal values is challenging and time-consuming. This paper optimizes Hadoop performance by tuning its configuration parameter settings automatically. The proposed work first employs gene expression programming to build an objective function from historical job running records; this function represents the correlation among the Hadoop configuration parameters. It then employs particle swarm optimization, which uses the objective function to search for optimal or near-optimal parameter settings. Experimental results show that the proposed work significantly improves Hadoop performance compared with the default settings. Moreover, it outperforms both rule-of-thumb settings and the Starfish model in Hadoop performance optimization. © 2016 The Authors. Concurrency and Computation: Practice and Experience Published by John Wiley & Sons Ltd.
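The search step described in the summary can be illustrated with a minimal particle swarm optimization sketch. This is not the authors' implementation: `predicted_runtime` is a hypothetical, hand-written stand-in for the objective function that the paper derives with gene expression programming, the three parameter keys are Hadoop 2.x names chosen only for illustration, and the bounds, swarm size, and coefficients (`w`, `c1`, `c2`) are assumptions.

```python
import random

# Illustrative search space: (lower, upper) bound per parameter.
# The keys are Hadoop 2.x parameter names; the bounds are assumptions.
BOUNDS = {
    "mapreduce.task.io.sort.mb": (50, 500),
    "mapreduce.task.io.sort.factor": (10, 200),
    "mapreduce.job.reduces": (1, 64),
}
NAMES = list(BOUNDS)


def predicted_runtime(x):
    """Hypothetical stand-in for the GEP-derived objective function.

    A made-up smooth surface with its minimum (100.0) at (300, 100, 16).
    It is not taken from the paper and does not model real Hadoop behaviour.
    """
    sort_mb, sort_factor, reduces = x
    return (
        100.0
        + ((sort_mb - 300) / 50) ** 2
        + ((sort_factor - 100) / 20) ** 2
        + ((reduces - 16) / 4) ** 2
    )


def clamp(value, low, high):
    """Keep a coordinate inside its allowed range."""
    return max(low, min(high, value))


def pso(objective, bounds, n_particles=20, n_iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    lows = [b[0] for b in bounds]
    highs = [b[1] for b in bounds]
    dim = len(bounds)

    # Random starting positions, zero starting velocities.
    positions = [[rng.uniform(lows[d], highs[d]) for d in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the swarm-wide best found so far.
    pbest = [p[:] for p in positions]
    pbest_cost = [objective(p) for p in positions]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                velocities[i][d] = (
                    w * velocities[i][d]
                    + c1 * r1 * (pbest[i][d] - positions[i][d])
                    + c2 * r2 * (gbest[d] - positions[i][d])
                )
                positions[i][d] = clamp(
                    positions[i][d] + velocities[i][d], lows[d], highs[d]
                )
            cost = objective(positions[i])
            if cost < pbest_cost[i]:
                pbest[i], pbest_cost[i] = positions[i][:], cost
                if cost < gbest_cost:
                    gbest, gbest_cost = positions[i][:], cost
    return gbest, gbest_cost


best, best_cost = pso(predicted_runtime, [BOUNDS[n] for n in NAMES])
# Hadoop expects integer values for these keys, so round the result.
settings = {name: round(value) for name, value in zip(NAMES, best)}
```

In a real setting the stand-in objective would be replaced by a function fitted to historical job records, and `settings` would be applied to a job configuration and validated by an actual run, since a surrogate objective only approximates true execution time.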
