Modular Checkpointing for Atomicity | Zendy

Lukasz Ziarek | Zendy; Philip Schatz | Zendy; Suresh Jagannathan | Zendy


Open Access

Modular Checkpointing for Atomicity

Author(s) - Lukasz Ziarek, Philip Schatz, Suresh Jagannathan

Publication year - 2007

Publication title - Electronic Notes in Theoretical Computer Science

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.242

H-Index - 60

ISSN - 1571-0661

DOI - 10.1016/j.entcs.2007.04.008

Subject(s) - atomicity , computer science , thread (computing) , distributed computing , modular design , debugging , concurrency , programming language , correctness , model checking , isolation (microbiology) , operating system , database transaction , microbiology and biotechnology , biology

Transient faults that arise in large-scale software systems can often be repaired by re-executing the code in which they occur. Ascribing a meaningful semantics to safe re-execution in multi-threaded code is not obvious, however. For a thread to correctly re-execute a region of code, it must ensure that all other threads that have witnessed its unwanted effects within that region are also reverted to a meaningful earlier state. If this is not done properly, data inconsistencies and other undesirable behavior may result. However, automatically determining what constitutes a consistent global checkpoint is not straightforward, since thread interactions are a dynamic property of the program.

In this paper, we present a safe and efficient checkpointing mechanism for Concurrent ML (CML) that can be used to recover from transient faults. We introduce a new linguistic abstraction called stabilizers that permits the specification of per-thread monitors and the restoration of globally consistent checkpoints. Global states are computed through lightweight monitoring of communication events among threads (e.g., message-passing operations or updates to shared variables). Our checkpointing abstraction provides atomicity and isolation guarantees during state restoration, ensuring that restored global states are safe.

Our experimental results on several realistic, multi-threaded, server-style CML applications, including a web server and a windowing toolkit, show that the overhead of using stabilizers is small, and lead us to conclude that they are a viable mechanism for defining safe checkpoints in concurrent functional programs. Our experiments conclude with a case study illustrating how to build open nested transactions from our checkpointing mechanism.
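The idea of computing a consistent global checkpoint from monitored communication events can be illustrated with a small toy model. The sketch below is not the paper's algorithm or its CML API: it is written in Python, uses a single hypothetical `Monitor` class with a global logical clock instead of per-thread monitors, and assumes every communication is symmetric, so that undoing it for one party requires undoing it for the other.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    """Toy monitor (hypothetical): logs checkpoints and communications."""
    clock: int = 0
    checkpoints: dict = field(default_factory=dict)  # thread -> [times]
    comms: list = field(default_factory=list)        # (time, a, b)

    def _tick(self):
        self.clock += 1
        return self.clock

    def checkpoint(self, thread):
        # Time 0 stands for the implicit checkpoint at thread start.
        self.checkpoints.setdefault(thread, [0]).append(self._tick())

    def communicate(self, a, b):
        for t in (a, b):
            self.checkpoints.setdefault(t, [0])
        self.comms.append((self._tick(), a, b))

    def _latest_before(self, thread, time):
        return max(c for c in self.checkpoints[thread] if c < time)

    def restore(self, thread):
        """Return {thread: checkpoint time} for a consistent rollback."""
        target = {thread: self.checkpoints.setdefault(thread, [0])[-1]}
        changed = True
        while changed:  # iterate to a fixed point; targets only decrease
            changed = False
            for time, a, b in self.comms:
                for x, y in ((a, b), (b, a)):
                    # x is rolled back past this event, so y must be too.
                    if x in target and time > target[x]:
                        need = self._latest_before(y, time)
                        if y not in target or need < target[y]:
                            target[y] = need
                            changed = True
        return target


m = Monitor()
m.checkpoint("A")        # time 1
m.checkpoint("B")        # time 2
m.communicate("A", "B")  # time 3
m.checkpoint("C")        # time 4
m.communicate("B", "C")  # time 5
m.checkpoint("D")        # time 6
print(m.restore("A"))    # {'A': 1, 'B': 2, 'C': 4}
```

In this example, reverting `A` also reverts `B` and `C`, because each transitively witnessed an effect produced after the checkpoint being restored, while `D` never communicated and is left untouched.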
