Parallel debugging: An investigative study
Author(s) - Zakari Abubakar, Lee Sai Peck
Publication year - 2019
Publication title - Journal of Software: Evolution and Process
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.371
H-Index - 29
eISSN - 2047-7481
pISSN - 2047-7473
DOI - 10.1002/smr.2178
Subject(s) - debugging, Jaccard index, computer science, Euclidean distance, cluster analysis, similarity (geometry), metric (unit), algorithmic program debugging, Hamming distance, data mining, software, parallel computing, algorithm, artificial intelligence, programming language, engineering, operations management, image (mathematics)
Abstract In the simultaneous localization of multiple software faults, a parallel debugging approach has consistently been used. Its effectiveness depends critically on the clustering algorithm and the distance metric chosen. However, clustering algorithms that group failed tests by execution profile similarity, using distance metrics such as Euclidean distance, Jaccard distance, and Hamming distance, are considered problematic and inappropriate. In this paper, we first conduct an investigative study on the usefulness of this problematic parallel debugging approach, which uses the k-means clustering algorithm (grouping failed tests by execution profile similarity) with the Euclidean distance metric, on three similarity coefficient-based fault localization techniques in terms of localization effectiveness. Second, we compare its effectiveness with the one-bug-at-a-time (OBA) debugging approach and a state-of-the-art parallel debugging approach named MSeer. The empirical evaluation is conducted on 540 multiple-fault versions of eight medium-sized to large-sized subject programs, with versions containing two, three, four, and five faults. Our results suggest that clustering failed tests by execution profile similarity with distance metrics such as Euclidean distance is indeed problematic and reduces effectiveness in localizing multiple faults.
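To make the approach under study concrete, the following is a minimal sketch of grouping failed tests by execution profile similarity. It is not the authors' implementation. The coverage matrix `failed_profiles` is hypothetical data, profiles are assumed to be binary statement-coverage vectors, and seeding k-means with the first `k` profiles is an assumption made only to keep the example deterministic. The three distance functions correspond to the metrics named in the abstract.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def hamming(a, b):
    """Number of positions at which two coverage vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def jaccard_distance(a, b):
    """1 - |intersection| / |union| over the covered statements."""
    union = sum(1 for x, y in zip(a, b) if x or y)
    inter = sum(1 for x, y in zip(a, b) if x and y)
    return 0.0 if union == 0 else 1.0 - inter / union


def kmeans(profiles, k, iterations=10):
    """Plain k-means with Euclidean distance.

    Assumption for illustration: the first k profiles seed the centroids,
    which keeps the result deterministic.
    """
    centroids = [[float(v) for v in p] for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: each failed test joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical data: rows are failed tests, columns are statements (1 = executed).
failed_profiles = [
    [1, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1],
]

labels = kmeans(failed_profiles, k=2)
print(labels)
```

In a parallel debugging setting, each resulting cluster of failed tests would be combined with the passed tests and handed to a fault localization technique separately. The sketch shows only the clustering step, which the study reports to be the problematic part.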