A runtime fault survival method for deployed software during production runs
Author(s) -
Seo Jooyoung,
Park Jihyun,
Choi Byoungju
Publication year - 2016
Publication title -
Journal of Software: Evolution and Process
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.371
H-Index - 29
eISSN - 2047-7481
pISSN - 2047-7473
DOI - 10.1002/smr.1767
Subject(s) - debugging, computer science, fault coverage, fault injection, fault (geology), embedded system, software fault tolerance, software, reliability engineering, reliability (semiconductor), state (computer science), real time computing, distributed computing, operating system, engineering, algorithm, power (physics), physics, electrical engineering, quantum mechanics, seismology, geology, electronic circuit
Abstract - Runtime memory faults during production runs should be addressed more thoroughly because they severely affect system availability. This paper proposes a method for mitigating memory faults during production runs of deployed software, thereby ensuring normal system operation until patches that fix the faults are delivered. Furthermore, the method helps enhance debugging efficiency by providing accurate on‐site fault information that developers can use to release timely patches. The core of the method is information tagging to identify runtime faults and a fault survival algorithm that provides differentiated fault mitigation according to the runtime state. We implemented RemOte runtime Protection for High‐risk Error (ROPHE) on a Linux 2.6 platform and conducted an empirical study of representative Linux applications. The results show that the average fault‐handling rate among the applications is 35.75%, whereas ROPHE greatly improves this capacity to an average of 91.94%. Specifically, the fault‐handling rates of the applications ranged widely, from 7.32% to 62.96%, while ROPHE provided fault‐survival rates in the relatively narrow range of 82.35–97.44%. The experimental results indicate that the proposed method provides a similarly high level of reliability for all applications regardless of their individual fault‐handling capacity. Copyright © 2016 John Wiley & Sons, Ltd.
