A study on using uncertain time series matching algorithms for MapReduce applications
Author(s) - Rizvandi Nikzad Babaii, Taheri Javid, Moraveji Reza, Zomaya Albert Y.
Publication year - 2012
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.2895
Subject(s) - computer science, scalability, tweaking, parsing, dynamic time warping, cloud computing, parallel computing, data mining, algorithm, database, artificial intelligence, operating system
SUMMARY In this paper, we study the CPU utilization time patterns of several MapReduce applications. After the running patterns of several applications are extracted, the patterns, along with their statistical information, are saved in a reference database and later used to tweak system parameters so that future unknown applications execute efficiently. To achieve this goal, the CPU utilization patterns of new applications, along with their statistical information, are compared with the known ones in the reference database to find/predict their most probable execution patterns. Because the patterns differ in length, dynamic time warping (DTW) is used for the comparison; a statistical analysis is then applied to the DTW outcomes to select the most suitable candidates. Furthermore, under a hypothesis, we also propose another algorithm to classify applications with similar CPU utilization patterns. Finally, the dependency between the minimum distance/maximum similarity of applications and scalability (in both input size and number of virtual nodes) is studied. We use widely used applications (WordCount, Distributed Grep, and Terasort) as well as an Exim MainLog parsing application to evaluate our hypothesis for automatically tweaking MapReduce configuration parameters when executing similar applications, scaling both the size of the input data and the number of virtual nodes. The results are very promising and show the effectiveness of our approach on a private cloud with up to 25 virtual nodes. Concurrency and Computation: Practice and Experience, 2012. Copyright © 2012 John Wiley & Sons, Ltd.
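
The matching step described in the summary can be illustrated with a minimal sketch of textbook dynamic time warping over CPU utilization series of different lengths. This is not the authors' implementation: the function names (`dtw_distance`, `closest_reference`), the absolute-difference cost, and the sample utilization values are assumptions made purely for illustration, and the statistical analysis the paper applies to the DTW outcomes is not reproduced here.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW distance between two numeric sequences.

    Assumption: the local cost is the absolute difference between
    samples; the paper may use a different cost or normalization.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # prev[j] holds the best cumulative cost of aligning a[:i-1] with b[:j].
    prev = [math.inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            curr[j] = cost + min(prev[j], curr[j - 1], prev[j - 1])
        prev = curr
    return prev[m]


def closest_reference(new_pattern, reference_db):
    """Return the name of the stored pattern with the smallest DTW distance.

    `reference_db` is a hypothetical dict mapping an application name to
    its recorded CPU utilization series.
    """
    return min(
        reference_db,
        key=lambda name: dtw_distance(new_pattern, reference_db[name]),
    )


# Made-up utilization traces (percent CPU), for illustration only.
reference_db = {
    "wordcount": [10, 50, 90, 50, 10],
    "terasort": [90, 90, 90, 90],
}
new_pattern = [10, 50, 50, 90, 50, 10]  # longer trace, similar shape
best_match = closest_reference(new_pattern, reference_db)
```

Because DTW allows one sample to align with several samples of the other series, the six-sample trace above can match the five-sample `wordcount` trace even though their lengths differ, which is the reason the summary gives for choosing DTW.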
