Thermal‐aware task assignments in high performance computing clusters
Author(s) - Taneja Shubbhi, Kulkarni Sanjay, Zhou Yi, Qin Xiao
Publication year - 2017
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.4206
Subject(s) - computer science, job scheduler, cluster (spacecraft), overhead (engineering), scheduling (production processes), schedule, central processing unit, node (physics), operating system, parallel computing, distributed computing, real time computing, cloud computing, engineering, operations management, structural engineering
Summary - Cluster‐level thermal management has gained much attention over the past decade because of the rising cooling costs of data centers. In this research, we propose and implement a static scheduler called SSched and a dynamic one named DSched. These two algorithms schedule jobs based on the CPU and disk temperatures of a Hadoop cluster's nodes. Our schedulers rely on a monitoring mechanism that tracks CPU and disk utilization, and they keep CPU and disk temperatures below a threshold through thermal‐aware scheduling decisions. To facilitate the design of SSched and DSched, we classify jobs into CPU‐intensive and disk‐intensive categories. When a job arrives, SSched retrieves its utilization statistics from a profiled log, estimates its thermal behavior, and places the job on a NodeManager so as to minimize thermal impact. Unlike SSched, DSched improves the thermal efficiency of Hadoop clusters through dynamic load balancing: it keeps track of the coolest and hottest nodes in the cluster and, when a hot spot is detected, migrates tasks from hot nodes to cool ones. To evaluate the effectiveness of our schedulers, we track the average CPU and disk temperatures of each node while maintaining an optimal outlet temperature across the cluster. We demonstrate that, compared with the traditional Hadoop scheduler, SSched and DSched achieve approximately 15% savings in cooling cost with little performance overhead.
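The two scheduling ideas in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation. The names `Job`, `Node`, `static_place`, and `dynamic_rebalance`, the 60 °C threshold, and the linear temperature model (full utilization adds 20 °C to the CPU or 10 °C to the disk) are all hypothetical and chosen only for illustration; the summary does not give the thermal model or the threshold actually used.

```python
from dataclasses import dataclass, field

THRESHOLD_C = 60.0     # hypothetical hot-spot threshold, not from the paper
CPU_GAIN_C = 20.0      # assumed CPU temperature rise at 100% utilization
DISK_GAIN_C = 10.0     # assumed disk temperature rise at 100% utilization


@dataclass
class Job:
    name: str
    cpu_util: float    # profiled CPU utilization in [0, 1]
    disk_util: float   # profiled disk utilization in [0, 1]

    @property
    def kind(self):
        # Classify the job by its dominant resource.
        return "cpu" if self.cpu_util >= self.disk_util else "disk"


@dataclass
class Node:
    name: str
    cpu_temp: float
    disk_temp: float
    jobs: list = field(default_factory=list)

    def predicted(self, job):
        # Estimated (cpu, disk) temperatures if the job ran on this node.
        return (self.cpu_temp + CPU_GAIN_C * job.cpu_util,
                self.disk_temp + DISK_GAIN_C * job.disk_util)

    def assign(self, job):
        self.cpu_temp, self.disk_temp = self.predicted(job)
        self.jobs.append(job)

    def release(self, job):
        self.jobs.remove(job)
        self.cpu_temp -= CPU_GAIN_C * job.cpu_util
        self.disk_temp -= DISK_GAIN_C * job.disk_util

    def hottest(self):
        return max(self.cpu_temp, self.disk_temp)


def static_place(job, nodes):
    """Static placement: pick the node with the lowest predicted
    temperature for the component the job stresses most."""
    def score(node):
        cpu, disk = node.predicted(job)
        return cpu if job.kind == "cpu" else disk

    target = min(nodes, key=score)
    target.assign(job)
    return target


def dynamic_rebalance(nodes):
    """Dynamic balancing: if the hottest node exceeds the threshold,
    move its most recent job to the coolest node."""
    hot = max(nodes, key=Node.hottest)
    cool = min(nodes, key=Node.hottest)
    if hot is cool or hot.hottest() <= THRESHOLD_C or not hot.jobs:
        return None
    job = hot.jobs[-1]
    hot.release(job)
    cool.assign(job)
    return job, hot, cool
```

In this sketch, `static_place` would be called once per arriving job with its profiled utilization, while `dynamic_rebalance` would be called periodically by a monitoring loop. A real scheduler would read temperatures from hardware sensors instead of a model and would also check that a migration does not create a new hot spot on the destination node.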