Software defect prediction with imbalanced distribution by radius‐synthetic minority over‐sampling technique

Author(s) - Guo Shikai, Dong Jian, Li Hui, Wang Jiahui

Publication year - 2021

Publication title - Journal of Software: Evolution and Process

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.371

H-Index - 29

eISSN - 2047-7481

pISSN - 2047-7473

DOI - 10.1002/smr.2362

Subject(s) - computer science, software, sampling (signal processing), software quality, task (project management), software bug, software metric, class (philosophy), data mining, machine learning, scope (computer science), artificial intelligence, reliability engineering, software development, engineering, systems engineering, computer vision, filter (signal processing), programming language

Software defect prediction, which identifies defect‐prone modules, is an effective technique for ensuring the quality of software products. Because of its importance in software maintenance, many learning‐based software defect prediction models have been presented in recent years. In practice, defects usually occupy a very small proportion of software source code; thus, the imbalanced distribution between defect‐prone and non‐defect‐prone modules increases the learning difficulty of the classification task. To address this issue, we present a random over‐sampling mechanism that generates minority‐class samples from a high‐dimensional sampling space to deal with the imbalanced distributions in software defect prediction. Two constraints provide a robust way to generate new synthetic samples: scaling the random over‐sampling scope to a reasonable area, and distinguishing the majority‐class samples in a critical region. Using nine open datasets of software projects, we experimentally verify that the presented method is effective at predicting defect‐prone modules and that it outperforms traditional methods for handling imbalanced data.
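The abstract does not give the algorithm itself, so the following is only a minimal sketch of what radius‐constrained random over‐sampling could look like, not the authors' method. The function name `radius_oversample`, the `shrink` parameter, and the concrete form of both constraints are assumptions made for illustration: the sampling scope around each minority sample is taken as a fraction of the distance to its nearest majority‐class sample, and a candidate is rejected when it lies closer to a majority‐class sample than to any minority‐class sample.

```python
import math
import random


def nearest_distance(point, others):
    """Euclidean distance from `point` to the closest row in `others`."""
    return min(math.dist(point, o) for o in others)


def radius_oversample(minority, majority, n_new, shrink=0.5, seed=0):
    """Generate `n_new` synthetic minority samples (illustrative sketch only).

    `shrink` is a hypothetical parameter in (0, 1); it is not taken from the paper.
    """
    rng = random.Random(seed)
    dim = len(minority[0])

    # Constraint 1 (assumed form): limit the sampling scope around each
    # minority sample to a fraction of the distance to its nearest
    # majority-class sample.
    radii = [shrink * nearest_distance(m, majority) for m in minority]

    synthetic = []
    while len(synthetic) < n_new:
        i = rng.randrange(len(minority))

        # Random direction: a Gaussian vector, normalised below.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(d * d for d in direction))

        # Scale the radius so candidates are uniform inside the ball.
        r = radii[i] * rng.random() ** (1.0 / dim)
        candidate = [c + r * d / norm for c, d in zip(minority[i], direction)]

        # Constraint 2 (assumed form): keep the candidate only if it is at
        # least as close to the minority class as to the majority class.
        if nearest_distance(candidate, minority) <= nearest_distance(candidate, majority):
            synthetic.append(candidate)
    return synthetic


# Toy data: two defect-prone (minority) modules, four non-defect-prone ones.
minority = [[0.0, 0.0], [1.0, 0.0]]
majority = [[5.0, 5.0], [6.0, 5.0], [5.0, 6.0], [-4.0, 3.0]]
new_points = radius_oversample(minority, majority, n_new=4)
```

In this sketch, keeping `shrink` at or below 0.5 means every candidate already satisfies the second check, so the rejection step only matters for larger radii. A real implementation would also need to choose how many samples to generate from the class ratio, which the abstract does not describe.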
