z-logo
Premium
Energy consumption analysis of data stream processing: a benchmarking approach
Author(s) -
Dayarathna Miyuru,
Li Yuanlong,
Wen Yonggang,
Fan Rui
Publication year - 2017
Publication title -
software: practice and experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/spe.2458
Subject(s) - computer science , stream processing , energy consumption , efficient energy use , data center , benchmarking , scheduling (production processes) , benchmark (surveying) , data stream , real time computing , distributed computing , spark (programming language) , overhead (engineering) , data processing , embedded system , database , computer network , operating system , engineering , telecommunications , electrical engineering , operations management , geodesy , marketing , geography , business , programming language
Summary Energy efficiency of data analysis systems has become a very important issue in recent times because of the increasing costs of data center operations. Although distributed streaming workloads have increasingly been present in modern data centers, energy‐efficient scheduling of such applications remains as a significant challenge. In this paper, we conduct an energy consumption analysis of data stream processing systems in order to identify their energy consumption patterns. We follow stream system benchmarking approach to solve this issue. Specifically, we implement Linear Road benchmark on six stream processing environments (S4, Storm, ActiveMQ, Esper, Kafka, and Spark Streaming) and characterize these systems' performance on a real‐world data center. We study the energy consumption characteristics of each system with varying number of roads as well as with different types of component layouts. We also use a microbenchmark to capture raw energy consumption characteristics. We observed that S4, Esper, and Spark Streaming environments had highest average energy consumption efficiencies compared with the other systems. Using a neural networkbased technique with the power/performance information gathered from our experiments, we developed a model for the power consumption behavior of a streaming environment. We observed that energy‐efficient execution of streaming application cannot be specifically attributed to the system CPU usage. We observed that communication between compute nodes with moderate tuple sizes and scheduling plans with balanced system overhead produces better power consumption behaviors in the context of data stream processing systems. Copyright © 2016 John Wiley & Sons, Ltd.

This content is not available in your region!

Continue researching here.

Having issues? You can contact us here