Toward resilient algorithms and applications

Open Access

Author(s) - Michael A. Heroux

Publication year - 2013

Publication title - OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information)

Language(s) - English

Resource type - Conference proceedings

DOI - 10.1145/2465813.2465814

Subject(s) - computer science, key (lock), reliability (semiconductor), state (computer science), scale (ratio), distributed computing, algorithm, computer security, physics, power (physics), quantum mechanics

Large-scale computing platforms have always dealt with unreliability from many sources. In contrast, applications for large-scale systems have generally assumed a fairly simplistic failure model: the computer is a reliable digital machine, with consistent execution time and infrequent failures that can be handled by occasionally storing a checkpoint of application state and restarting from that saved state if the system fails. Many computing experts and several key technology trends indicate that this simplistic application view of a high-end system is no longer feasible. Instead, algorithm and application developers must adopt more complex models of system reliability and adapt algorithms and implementations to be more resilient in the presence of failures and of increased failure detection and correction. In this talk we present motivation for moving away from a checkpoint-restart-only model and discuss several new models for resilience, including latency tolerance, local recovery from local failure, and selective reliability. We also discuss strategies for designing new algorithms and applications, and some of the required system and programming environment features.
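
The checkpoint-restart model described in the abstract can be illustrated with a small sketch. This is not code from the talk: the function `run_with_checkpoints`, the exception `SimulatedFailure`, and the parameters `checkpoint_every` and `fail_at` are hypothetical names chosen for illustration, and the "application" is just a running sum. The sketch assumes a failure discards all work done since the last saved state.

```python
import copy


class SimulatedFailure(Exception):
    """Hypothetical stand-in for a system failure."""


def run_with_checkpoints(n_steps, checkpoint_every, fail_at):
    """Sum the integers 0..n_steps-1 while surviving injected failures.

    fail_at is a set of step indices; each one triggers a single failure.
    Returns (total, executed), where executed counts every completed step,
    including work that had to be redone after a restart.
    """
    state = {"step": 0, "total": 0}
    checkpoint = copy.deepcopy(state)  # last saved application state
    pending_failures = set(fail_at)
    executed = 0

    while state["step"] < n_steps:
        try:
            step = state["step"]
            if step in pending_failures:
                pending_failures.discard(step)  # each failure fires once
                raise SimulatedFailure(step)
            state["total"] += step
            state["step"] = step + 1
            executed += 1
            if state["step"] % checkpoint_every == 0:
                checkpoint = copy.deepcopy(state)  # occasional checkpoint
        except SimulatedFailure:
            # Global restart: all work since the last checkpoint is lost.
            state = copy.deepcopy(checkpoint)

    return state["total"], executed


total, executed = run_with_checkpoints(10, 4, {6})
print(total, executed)
```

With a checkpoint every 4 steps and one failure at step 6, the run finishes with the correct sum but repeats steps 4 and 5. That redone work is the cost a checkpoint-restart-only model pays on every failure, and it is the kind of cost that local recovery from local failure aims to reduce.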
