Detecting duplicate bug reports with software engineering domain knowledge
Author(s) -
Aggarwal Karan,
Timbers Finbarr,
Rutgers Tanner,
Hindle Abram,
Stroulia Eleni,
Greiner Russell
Publication year - 2017
Publication title -
Journal of Software: Evolution and Process
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.371
H-Index - 29
eISSN - 2047-7481
pISSN - 2047-7473
DOI - 10.1002/smr.1821
Subject(s) - computer science, domain (mathematical analysis), software engineering, domain knowledge, context (archaeology), data science, software, domain engineering, software mining, information retrieval, data mining, software development, software construction, mathematical analysis, paleontology, mathematics, biology, programming language
Bug deduplication, i.e., recognizing bug reports that refer to the same problem, is a challenging task in the software‐engineering life cycle. Researchers have proposed several methods, primarily relying on information‐retrieval techniques. Our work, motivated by the intuition that domain knowledge can provide the relevant context to enhance effectiveness, attempts to improve the use of information retrieval by augmenting it with software‐engineering knowledge. In our previous work, we proposed the software‐literature‐context method, which uses software‐engineering literature as a source of contextual information to detect duplicates. If bug reports relate to similar subjects, they have a better chance of being duplicates. Our method, being largely automated, has the potential to substantially decrease the level of manual effort involved in conventional techniques, with a minor trade‐off in accuracy. In this study, we extend our work by demonstrating that domain‐specific features can be applied across projects, rather than the project‐specific features demonstrated previously, while still maintaining performance. We also introduce a hierarchy‐of‐context to capture software‐engineering knowledge in the realms of the contextual space and produce performance gains. Finally, we highlight the importance of domain‐specific contextual features through cross‐domain contexts: adding context improved accuracy, and Kappa scores improved by at least 3.8% to 10.8% per project.
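
The intuition that reports on similar subjects are more likely to be duplicates can be illustrated with a minimal sketch. This is not the authors' implementation: the context word lists, the function names (`tokenize`, `context_vector`, `cosine`), and the simple term-frequency scoring are all assumptions made for illustration, and the published method may score contexts quite differently. Each report is mapped to a vector with one entry per context, and two reports are compared by the cosine similarity of those vectors.

```python
import math
import re
from collections import Counter

# Hypothetical context word lists (illustrative only; in the abstract's
# framing these would be derived from software-engineering literature).
CONTEXTS = {
    "testing": {"test", "assert", "regression", "failure"},
    "ui": {"button", "screen", "click", "layout"},
    "network": {"socket", "timeout", "connection", "http"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def context_vector(text):
    """Return, per context, the fraction of tokens found in its word list."""
    counts = Counter(tokenize(text))
    total = sum(counts.values()) or 1  # avoid division by zero on empty text
    return [sum(counts[w] for w in words) / total for words in CONTEXTS.values()]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Made-up example reports, not taken from any real bug tracker.
report_a = context_vector("Connection timeout when opening socket")
report_b = context_vector("HTTP connection drops with timeout")
report_c = context_vector("Button click does nothing on screen")

print(cosine(report_a, report_b))  # same subject: high similarity
print(cosine(report_a, report_c))  # different subjects: low similarity
```

In a setup like this, the contextual similarity would be one extra feature alongside textual and categorical similarity features fed to a duplicate classifier, rather than a decision rule on its own.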
