Towards self‐caring MapReduce: a study of performance penalties under faults
Author(s) - Selvi Kadirvel, José A.B. Fortes
Publication year - 2013
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.3044
Subject(s) - testbed , computer science , fault tolerance , distributed computing , focus (optics) , node (physics) , scaling , resource (disambiguation) , service level agreement , empirical research , fault (geology) , cloud computing , operating system , computer network , engineering , philosophy , physics , geometry , mathematics , structural engineering , epistemology , seismology , geology , optics
Summary Self‐caring IT systems are those that can proactively avoid system failures rather than reactively handle failures after they have occurred. In this paper, we focus on failures in which a MapReduce job is unable to finish within the completion time set by a service‐level agreement. The existing fault‐tolerance capability provided by MapReduce frameworks such as Hadoop is simple, and the penalty associated with handling faults could potentially lead to excessive job execution times. Our goal in this paper is to bring out the severity of this penalty for different job and framework parameters. We quantitatively evaluate the penalty in execution time associated with node faults using the MRPerf simulator. We then perform an empirical study of penalties on a virtualized testbed consisting of Xen domains by varying system characteristics along four dimensions: hardware, application, dataset, and fault types. Through simulation and empirical results, we show that violations of job‐completion‐time service‐level agreements can be reduced using dynamic resource scaling. Scaling leverages the elastic properties of a virtualized environment to mitigate execution time penalties and hence proactively avoids a potential job failure. We show that, using resource scaling, performance penalties can be decreased to less than 5% of the no‐fault execution time at minimal additional cost. Copyright © 2013 John Wiley & Sons, Ltd.
