The design and implementation of Berkeley Lab's Linux checkpoint/restart
Author(s) - Jason Duell
Publication year - 2005
Language(s) - English
Resource type - Reports
DOI - 10.2172/891617
Subject(s) - operating system, computer science, linux kernel, scheduling (production processes), interface (matter), kernel (algebra), component (thermodynamics), system call, parallel computing, embedded system, engineering, operations management, physics, mathematics, bubble, combinatorics, maximum bubble pressure method, thermodynamics
This paper describes Berkeley Linux Checkpoint/Restart (BLCR), a Linux kernel module that allows system-level checkpoints on a variety of Linux systems. BLCR can be used either as a standalone system for checkpointing applications on a single machine, or as a component used by a scheduling system or parallel communication library to checkpoint and restore parallel jobs running on multiple machines. Integration with the Message Passing Interface (MPI) and other parallel systems is also described.