Adding fault‐tolerant transaction processing to LINDA
Author(s) -
Can Scott R.,
Dunn David
Publication year - 1994
Publication title -
Software: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.437
H-Index - 70
eISSN - 1097-024X
pISSN - 0038-0644
DOI - 10.1002/spe.4380240503
Subject(s) - computer science, tuple space, tuple, transaction processing system, overhead (engineering), process (computing), programmer, fault tolerance, unix, distributed computing, node (physics), parallel computing, transaction processing, operating system, database transaction, software, programming language, mathematics, discrete mathematics, structural engineering, engineering
To simplify the difficult task of writing fault‐tolerant parallel software, we implemented extensions to the basic functionality of the LINDA (tuple‐space) programming model. Our approach uses transaction processing to ensure that tuples are handled correctly in the event of a node or communications failure. If a process that retrieves a tuple fails to complete its processing, or if a tuple posting or retrieval message is lost, the system is automatically rolled back to a previous stable state. Processing failures and lost messages are detected by time‐out alarms. Roll‐back is accomplished by reposting the pertinent tuples. Intermediate tuples produced during partial processing are not committed or made available to other processes until the producing process completes. In the absence of faults, system overhead is low. The fault‐tolerance mechanism is implemented at the system level and requires little programmer effort or expertise. Two implementations of the model are discussed, one using a UNIX network of workstations and one using a Transputer network. Data measuring model overhead and some aspects of system performance in the presence of faults are presented for an example system.
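
The mechanism described in the abstract (retrieve a tuple, stage intermediate tuples, commit on completion, roll back by reposting on time-out) can be illustrated with a small single-process sketch. This is not the authors' implementation: the class `TransactionalTupleSpace`, its methods `out`, `take`, `commit` and `expire`, and the injectable `clock` parameter are all hypothetical names chosen for illustration, and a real system would run across nodes with actual time-out alarms rather than an explicit `expire` call.

```python
import time


class TransactionalTupleSpace:
    """Toy in-memory tuple space with transactional retrieval (illustrative only)."""

    def __init__(self, timeout=1.0, clock=time.monotonic):
        self.timeout = timeout      # seconds a transaction may stay open
        self.clock = clock          # injectable clock (assumption, eases testing)
        self.tuples = []            # committed tuples visible to every process
        self.pending = {}           # txn id -> (deadline, taken tuples, staged tuples)
        self.next_id = 0

    def out(self, tup, txn=None):
        """Post a tuple; inside a transaction it is staged, not yet visible."""
        if txn is None:
            self.tuples.append(tup)
        else:
            self._live(txn)[2].append(tup)

    def take(self, match):
        """Withdraw the first tuple satisfying `match` and open a transaction.

        Returns (txn, tuple), or None when no committed tuple matches.
        """
        self.expire()
        for i, tup in enumerate(self.tuples):
            if match(tup):
                del self.tuples[i]
                txn = self.next_id
                self.next_id += 1
                self.pending[txn] = (self.clock() + self.timeout, [tup], [])
                return txn, tup
        return None

    def commit(self, txn):
        """Complete a transaction: staged tuples become visible."""
        _, _, staged = self._live(txn)
        del self.pending[txn]
        self.tuples.extend(staged)

    def expire(self):
        """Roll back timed-out transactions by reposting the tuples they took."""
        now = self.clock()
        late = [t for t, (deadline, _, _) in self.pending.items() if deadline <= now]
        for txn in late:
            _, taken, _ = self.pending.pop(txn)
            self.tuples.extend(taken)   # staged outputs are simply discarded

    def _live(self, txn):
        """Return the record of an open transaction or raise if it was rolled back."""
        self.expire()
        if txn not in self.pending:
            raise KeyError("transaction unknown or rolled back after time-out")
        return self.pending[txn]


if __name__ == "__main__":
    space = TransactionalTupleSpace(timeout=0.05)
    space.out(("task", 7))
    txn, task = space.take(lambda t: t[0] == "task")
    space.out(("partial", 7), txn=txn)   # staged, invisible to others
    time.sleep(0.1)                      # simulate a worker that never finishes
    space.expire()
    print(space.tuples)                  # the original task tuple is available again
```

In this sketch the fault-free path costs only a dictionary entry per open transaction, which loosely mirrors the abstract's claim of low overhead when no faults occur; the actual overhead figures are those reported in the paper, not anything this toy model measures.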