Hardware implementation of fault‐tolerance in dual computer systems
Author(s) - Samet Refik
Publication year - 2009
Publication title - Quality and Reliability Engineering International
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.913
H-Index - 62
eISSN - 1099-1638
pISSN - 0748-8017
DOI - 10.1002/qre.1018
Subject(s) - fault tolerance, fault (geology), computer science, software fault tolerance, dual (grammatical number), fault coverage, fault model, reliability (semiconductor), embedded system, stuck at fault, sequence (biology), key (lock), point (geometry), fault injection, reliability engineering, real time computing, distributed computing, fault detection and isolation, engineering, software, operating system, artificial intelligence, art, mathematics, actuator, literature, biology, genetics, power (physics), geometry, quantum mechanics, electronic circuit, physics, seismology, electrical engineering, geology
In this paper, we propose an architectural design for a dual computer system (DCS) that operates in real time, with fault tolerance implemented purely in hardware. The design is novel in that the hardware itself performs two key services: determining the fault type (temporary or permanent) and localizing the faulty computer, both without self‐testing techniques or diagnosis routines. We also propose a non‐trivial sequence of fault‐tolerance services in which the fault type is determined and the computational processes are recovered after a temporary fault before fault localization takes place. The design has several benefits: the dedicated hardware shortens the recovery point time period; the proposed sequence of fault‐tolerance services reduces to two the number of logical segments that must be re‐run to recover the computational processes; and determining the fault type means that only a computer with a permanent fault is eliminated. Together, these contributions increase both system performance and system reliability. Copyright © 2009 John Wiley & Sons, Ltd.
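
The ordering described in the abstract (classify the fault and recover from a temporary one before attempting localization) can be illustrated with a small software simulation. This is only a sketch under stated assumptions, not the paper's hardware design: the abstract does not say how the fault type is determined, so the compare-and-re-run rule below is an assumed stand-in, and every name here (`classify_fault`, `healthy`, `make_transient`, `stuck`) is hypothetical. Localization of the faulty computer is deliberately left out.

```python
from typing import Callable, Tuple

# Hypothetical model: a logical segment maps the state saved at the last
# recovery point to a new state. Each of the two computers runs its own copy.
Segment = Callable[[int], int]


def classify_fault(seg_a: Segment, seg_b: Segment, checkpoint: int,
                   retries: int = 1) -> Tuple[str, int]:
    """Run one segment on both computers and compare the results.

    Assumed rule (not taken from the paper): if the results differ, roll
    back to the checkpoint and re-run; agreement on a re-run means the
    fault was temporary, persistent disagreement means it is permanent.
    Returns (fault_type, state), with fault_type in
    {"none", "temporary", "permanent"}.
    """
    out_a, out_b = seg_a(checkpoint), seg_b(checkpoint)
    if out_a == out_b:
        return "none", out_a
    for _ in range(retries):
        # Roll back to the recovery point and re-run the segment.
        out_a, out_b = seg_a(checkpoint), seg_b(checkpoint)
        if out_a == out_b:
            return "temporary", out_a  # recovered, no localization needed
    # Disagreement persists: stay at the checkpoint; localization would follow.
    return "permanent", checkpoint


def healthy(x: int) -> int:
    return x + 1


def make_transient() -> Segment:
    """Build a segment that returns a wrong result on its first call only."""
    calls = {"n": 0}

    def seg(x: int) -> int:
        calls["n"] += 1
        return x + 1 if calls["n"] > 1 else x + 99

    return seg


def stuck(x: int) -> int:
    return 0  # always wrong, modelling a permanent fault
```

In this model a temporary fault costs one extra run of the segment and never triggers localization, which mirrors the abstract's point that only a computer with a permanent fault needs to be eliminated.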
