Causal inference based fault localization for numerical software with NUMFL
Author(s) - Bai Zhuofu, Shu Gang, Podgurski Andy
Publication year - 2017
Publication title - Software Testing, Verification and Reliability
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.216
H-Index - 49
eISSN - 1099-1689
pISSN - 0960-0833
DOI - 10.1002/stvr.1613
Subject(s) - covariate, causal inference, confounding, inference, propensity score matching, statistics, computer science, statistical inference, fault (geology), expression (computer science), econometrics, data mining, mathematics, artificial intelligence, seismology, programming language, geology
Summary - This paper presents NUMFL, a value‐based causal inference technique for localizing faults in numerical software. NUMFL combines causal and statistical analyses to characterize the causal effects of individual numerical expressions on output errors. Given value profiles for an expression's variables, NUMFL uses generalized propensity scores or covariate balancing propensity scores to reduce the confounding bias caused by the evaluation of other, faulty expressions. It estimates the average failure‐causing effect of an expression using statistical regression models fit within generalized propensity score or covariate balancing propensity score subclasses (strata). This paper also reports on an empirical evaluation of NUMFL involving components from four Java numerical libraries, in which it was compared with five alternative statistical fault localization metrics. The results indicate that NUMFL is more effective than competitive statistical fault localization techniques. The results also indicate that NUMFL works surprisingly well with data from failing runs alone. Copyright © 2016 John Wiley & Sons, Ltd.
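To make the stratified estimation step in the summary more concrete, the following is a minimal Python sketch of the general idea, not the authors' implementation. It assumes a single covariate per expression, a Gaussian linear model for the generalized propensity score, equal-size strata, and a simple linear regression of output error on the expression's value within each stratum. All function names (`fit_line`, `generalized_propensity_scores`, `stratified_effect`) are hypothetical and chosen for illustration only.

```python
import math
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0  # no variation in x: slope is undefined, report 0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def generalized_propensity_scores(treatment, covariate):
    """Assumed model: treatment | covariate ~ Normal(a + b*covariate, var).

    The score of a run is the fitted density at its observed treatment value.
    """
    a, b = fit_line(covariate, treatment)
    resid = [t - (a + b * x) for t, x in zip(treatment, covariate)]
    var = max(sum(r * r for r in resid) / max(len(resid) - 2, 1), 1e-12)
    return [math.exp(-r * r / (2 * var)) / math.sqrt(2 * math.pi * var)
            for r in resid]


def stratified_effect(treatment, covariate, outcome, n_strata=5):
    """Size-weighted mean of per-stratum slopes of outcome on treatment.

    Strata are equal-size groups of runs ordered by propensity score
    (an illustrative choice, not taken from the paper).
    """
    gps = generalized_propensity_scores(treatment, covariate)
    order = sorted(range(len(gps)), key=gps.__getitem__)
    n = len(order)
    total, used = 0.0, 0
    for k in range(n_strata):
        idx = order[k * n // n_strata:(k + 1) * n // n_strata]
        if len(idx) < 3:
            continue  # too few runs in this stratum to fit a regression
        _, slope = fit_line([treatment[i] for i in idx],
                            [outcome[i] for i in idx])
        total += slope * len(idx)
        used += len(idx)
    return total / used if used else 0.0
```

In this sketch, `treatment` would hold the value an expression produced in each run, `covariate` a value of one of the expression's input variables, and `outcome` the magnitude of the output error. Expressions would then be ranked by the magnitude of the returned estimate, with larger magnitudes treated as more suspicious. That ranking rule is likewise an assumption of the sketch.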