Premium
An implementation of using remote memory to checkpoint processes
Author(s) -
Hsu ShangTe,
Chang RueiChuan
Publication year - 1999
Publication title -
software: practice and experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/(sici)1097-024x(199909)29:11<985::aid-spe269>3.0.co;2-1
Subject(s) - computer science , operating system , exploit , overhead (engineering) , hard disk drive performance characteristics , latency (audio) , process (computing) , workstation , parallel computing , embedded system , computer security , telecommunications
Process checkpointing is a procedure which periodically saves the process states into stable storage. Most checkpointing facilities select hard disks for archiving. However, the disk seek time is limited by the speed of the read‐write heads, thus checkpointing process into a local disk requires extensive disk bandwidth. In this paper, we propose an approach that exploits the memory on idle workstations as a faster storage for checkpointing. In our scheme, autonomous machines which submit jobs to the computation server offer their physical memory to the server for job checkpointing. Eight applications are used to measure the remote memory performance in four checkpointing policies. Experimental results show that remote memory reduces at least 34.5 per cent of the overhead for sequential checkpointing and 32.1 per cent for incremental checkpointing. Additionally, to checkpoint a running process into a remote memory requires only 60 per cent of the local disk checkpoint latency time. Copyright © 1999 John Wiley & Sons, Ltd.