Open Access
Integrating Low-latency Analysis into HPC System Monitoring
Author(s) - Ramin Izadpanah, Nichamon Naksinehaboon, Jim Brandt, Ann Gentile, Damian Dechev
Publication year - 2018
Publication title - OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/3225058.3225086
Subject(s) - computer science, latency (audio), variety (cybernetics), component (thermodynamics), raw data, focus (optics), system monitoring, supercomputer, real time computing, data science, distributed computing, operating system, telecommunications, physics, artificial intelligence, optics, thermodynamics, programming language
The growth of High Performance Computing (HPC) systems increases the complexity of understanding resource utilization, system management, and performance issues. While raw performance data is increasingly exposed at the component level, the usefulness of the data depends on the ability to perform meaningful analysis on actionable timescales. However, current system monitoring infrastructures largely focus on data collection, with analysis performed off-system in post-processing mode. This increases the time required to provide analysis and feedback to a variety of consumers. In this work, we enhance the architecture of a monitoring system used on large-scale computational platforms to integrate streaming analysis capabilities at arbitrary locations within its data collection, transport, and aggregation facilities. We leverage the flexible communication topology of the monitoring system to enable placement of transformations based on overhead concerns, while still enabling low-latency exposure on node. Our design internally supports and exposes the raw and transformed data uniformly for both node-level and off-system consumers. We show the viability of our implementation for a production-relevant case: run-time determination of the relative per-node file system demands.
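To make the closing use case concrete, the following is a minimal sketch of one way a streaming transformation could turn raw per-node file system counters into relative demand. It is not the authors' implementation: the class `RateTransform`, the function `relative_demand`, and the assumption that each sample carries a cumulative byte counter with a timestamp are all hypothetical and chosen only for illustration.

```python
class RateTransform:
    """Hypothetical streaming transform: cumulative counter -> per-second rate.

    Assumes each node reports a monotonically increasing counter
    (for example, total bytes read and written) with a timestamp in seconds.
    """

    def __init__(self):
        # node name -> (timestamp, counter) of the most recent sample
        self._last = {}

    def update(self, node, timestamp, counter):
        """Consume one sample; return the rate since the previous sample.

        Returns None for the first sample of a node, for out-of-order
        timestamps, and when the counter appears to have been reset.
        """
        prev = self._last.get(node)
        self._last[node] = (timestamp, counter)
        if prev is None:
            return None
        dt = timestamp - prev[0]
        if dt <= 0 or counter < prev[1]:
            return None
        return (counter - prev[1]) / dt


def relative_demand(rates):
    """Normalize per-node rates into each node's share of the total."""
    total = sum(rates.values())
    if total == 0:
        return {node: 0.0 for node in rates}
    return {node: rate / total for node, rate in rates.items()}


if __name__ == "__main__":
    transform = RateTransform()
    # Illustrative samples only: (node, timestamp in seconds, cumulative bytes)
    samples = [
        ("node1", 0.0, 0),
        ("node2", 0.0, 0),
        ("node1", 1.0, 300),
        ("node2", 1.0, 100),
    ]
    rates = {}
    for node, ts, counter in samples:
        rate = transform.update(node, ts, counter)
        if rate is not None:
            rates[node] = rate
    print(relative_demand(rates))
```

Because the transform keeps only the last sample per node, a sketch like this could in principle run either on the compute node or at an aggregation point, which is the kind of placement trade-off the abstract describes.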
