Detecting duplicate bug reports with software engineering domain knowledge

Author(s) - Karan Aggarwal, Finbarr Timbers, Tanner Rutgers, Abram Hindle, Eleni Stroulia, Russell Greiner

Publication year - 2017

Publication title - Journal of Software: Evolution and Process

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.371

H-Index - 29

eISSN - 2047-7481

pISSN - 2047-7473

DOI - 10.1002/smr.1821

Subject(s) - computer science, domain (mathematical analysis), software engineering, domain knowledge, context (archaeology), data science, software, domain engineering, software mining, information retrieval, data mining, software development, software construction, mathematical analysis, paleontology, mathematics, biology, programming language

Bug deduplication, i.e., recognizing bug reports that refer to the same problem, is a challenging task in the software-engineering life cycle. Researchers have proposed several methods, primarily relying on information-retrieval techniques. Our work, motivated by the intuition that domain knowledge can provide the relevant context to enhance effectiveness, attempts to improve the use of information retrieval by augmenting it with software-engineering knowledge. In our previous work, we proposed the software-literature-context method, which uses software-engineering literature as a source of contextual information to detect duplicates: if bug reports relate to similar subjects, they have a better chance of being duplicates. Our method, being largely automated, has the potential to substantially decrease the manual effort involved in conventional techniques, with a minor trade-off in accuracy. In this study, we extend our work by demonstrating that domain-specific features can be applied across projects, rather than the project-specific features demonstrated previously, while still maintaining performance. We also introduce a hierarchy-of-context to capture software-engineering knowledge in the contextual space and produce performance gains. Finally, we highlight the importance of domain-specific contextual features through cross-domain contexts: adding context improved accuracy, and Kappa scores improved by 3.8% to 10.8% per project.
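The abstract does not spell out how contextual features are computed, so the following is only a minimal sketch of the general idea, under stated assumptions. Each bug report is scored against a few context word lists, two reports are compared by the cosine similarity of their context vectors, and a classifier's output is evaluated with Cohen's kappa. The `CONTEXTS` word lists and the names `context_vector`, `cosine_similarity`, and `cohen_kappa` are hypothetical and are not taken from the paper.

```python
import math
import re
from collections import Counter

# Hypothetical context word lists. The paper derives its contexts from
# software-engineering literature; these tiny lists are illustrative only.
CONTEXTS = {
    "performance": {"slow", "latency", "memory", "cpu", "leak"},
    "ui": {"button", "screen", "layout", "click", "dialog"},
    "networking": {"socket", "timeout", "connection", "http", "dns"},
}


def context_vector(report_text):
    """Count how many tokens of a report fall into each context word list."""
    tokens = Counter(re.findall(r"[a-z]+", report_text.lower()))
    return [sum(tokens[w] for w in words) for words in CONTEXTS.values()]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement between two labelings, corrected for chance."""
    n = len(y_true)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    # Kappa is undefined when chance agreement is total; this sketch
    # returns 1.0 in that case as a simplifying assumption.
    return 1.0 if expected == 1 else (observed - expected) / (1 - expected)


if __name__ == "__main__":
    a = context_vector("App is slow and memory leak grows")
    b = context_vector("High memory usage, cpu spikes")
    c = context_vector("Button click does nothing in dialog")
    print(cosine_similarity(a, b), cosine_similarity(a, c))
    print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

In this toy run the two performance-related reports share a context and score 1.0, while the performance report and the UI report score 0.0. In a real pipeline such a similarity would be one feature among several (alongside textual and categorical ones) fed to a duplicate/non-duplicate classifier.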
