BigDataSDNSim: A simulator for analyzing big data applications in software‐defined cloud data centers
Author(s) - Alwasel Khaled, Calheiros Rodrigo N., Garg Saurabh, Buyya Rajkumar, Pathan Mukaddim, Georgakopoulos Dimitrios, Ranjan Rajiv
Publication year - 2021
Publication title - Software: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/spe.2917
Subject(s) - big data, computer science, cloud computing, correctness, distributed computing, replication (statistics), software, yarn, data intensive computing, data processing, programming paradigm, software defined networking, data mining, database, operating system, statistics, materials science, geometry, mathematics, grid computing, composite material, programming language, grid
The integration and cross‐coordination of big data processing and software‐defined networking (SDN) are vital for improving the performance of big data applications. Various approaches for combining big data and SDN have been investigated by both industry and academia. However, empirical evaluations of solutions that combine big data processing and SDN are extremely costly and complicated. To make such evaluations practical, we present a new, self‐contained simulation tool named BigDataSDNSim that enables the modeling and simulation of the big data management system YARN, its related programming model MapReduce, and SDN‐enabled networks in a cloud computing environment. BigDataSDNSim supports cost‐effective, easy‐to‐conduct experimentation in a controllable, repeatable, and configurable manner. The article illustrates the simulation accuracy and correctness of BigDataSDNSim by comparing the behavior and results of a real environment that combines big data processing and SDN with those of an equivalent simulated environment. Finally, the article presents two use cases of BigDataSDNSim, which exhibit its practicality and features, illustrate the impact of the data replication mechanisms of MapReduce in Hadoop YARN, and show the superiority of SDN over traditional networks in improving the performance of MapReduce applications.