Analyzing Cost Parameters Affecting Map Reduce Application Performance
Author(s) - Narinder K. Seera, S. Taruna
Publication year - 2016
Publication title - International Journal of Information Technology and Computer Science
Language(s) - English
Resource type - Journals
eISSN - 2074-9015
pISSN - 2074-9007
DOI - 10.5815/ijitcs.2016.08.06
Subject(s) - computer science, scalability, big data, flexibility (engineering), distributed computing, fault tolerance, data mining, database, statistics, mathematics
Recently, big data analysis has become an imperative task for many large companies. Map-Reduce, an emerging distributed computing paradigm, is regarded as a promising architecture for big data analytics on commodity hardware. Map-Reduce and its open-source implementation, Hadoop, have been widely adopted by companies because of salient features such as scalability, elasticity, fault tolerance, and flexibility in handling big data. However, these benefits come at a considerable performance cost. The performance of a Map-Reduce application depends on various factors, including the size of the input data set and cluster resource settings. A clear understanding of the factors that affect Map-Reduce application performance, and of the cost associated with those factors, is therefore required. In this paper, we study different performance parameters and an existing cost-based optimizer that computes the cost of Map-Reduce job execution. This optimizer also takes into account the various Hadoop configuration parameters that affect the performance of these programs. The paper aims to analyze Map-Reduce application performance and to identify the key factors affecting the cost and performance of executing Map-Reduce applications.
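The paper's cost-based optimizer is not reproduced here. As a rough illustration of the idea it describes, the following minimal Java sketch combines per-phase unit costs with a few configuration parameters (input size, map output ratio, and map/reduce slot counts) into a single job cost estimate. All constants, names, and the formula itself are illustrative assumptions, not the optimizer studied in the paper.

// Illustrative sketch only: a simplified per-phase cost model for a
// MapReduce job. The phase weights and the formula are assumptions
// for demonstration, not the cost optimizer analyzed in the paper.
public class MapReduceCostSketch {

    // Hypothetical per-unit costs (e.g., seconds per MB); tune per cluster.
    static final double READ_COST_PER_MB    = 0.01; // HDFS read
    static final double MAP_CPU_PER_MB      = 0.02; // map-side processing
    static final double SHUFFLE_COST_PER_MB = 0.03; // sort + network transfer
    static final double REDUCE_CPU_PER_MB   = 0.02; // reduce-side processing
    static final double WRITE_COST_PER_MB   = 0.01; // HDFS write

    /**
     * Estimates total job cost from the input size, the map output ratio
     * (map output bytes / input bytes), and the degree of parallelism
     * implied by the numbers of map and reduce slots.
     */
    static double estimateJobCost(double inputMb, double mapOutputRatio,
                                  int mapSlots, int reduceSlots) {
        double mapOutputMb = inputMb * mapOutputRatio;
        double mapPhase    = (READ_COST_PER_MB + MAP_CPU_PER_MB) * inputMb / mapSlots;
        double shuffle     = SHUFFLE_COST_PER_MB * mapOutputMb / reduceSlots;
        double reducePhase = (REDUCE_CPU_PER_MB + WRITE_COST_PER_MB) * mapOutputMb / reduceSlots;
        return mapPhase + shuffle + reducePhase;
    }

    public static void main(String[] args) {
        // Example: 10 GB input, map output half the input size,
        // 20 map slots and 5 reduce slots available.
        double cost = estimateJobCost(10 * 1024, 0.5, 20, 5);
        System.out.printf("Estimated job cost: %.1f (arbitrary units)%n", cost);
    }
}

Even this toy model makes the paper's point visible: changing a configuration parameter such as the number of reduce slots shifts the shuffle and reduce terms, so the same job on the same data can have very different estimated costs.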