Software change‐proneness prediction through combination of bagging and resampling methods
Author(s) - Zhu Xiaoyan, He Yueyang, Cheng Long, Jia Xiaolin, Zhu Lei
Publication year - 2018
Publication title - Journal of Software: Evolution and Process
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.371
H-Index - 29
eISSN - 2047-7481
pISSN - 2047-7473
DOI - 10.1002/smr.2111
Subject(s) - resampling, computer science, machine learning, naive bayes classifier, artificial intelligence, data mining, c4.5 algorithm, undersampling, software, support vector machine, classifier (uml), programming language
Identifying the change‐prone parts of a software system helps managers and developers allocate maintenance resources and time effectively during the early phases of the software life cycle. File‐level change‐proneness prediction with binary classification methods makes such identification possible. Because change‐prone files typically account for only a small fraction of all files, standard classification methods perform unsatisfactorily. In this paper, we employ imbalanced learning methods, including bagging, resampling, and especially their combination, to reduce the performance loss that the class imbalance problem causes for standard classifiers in change‐proneness prediction. We also propose a boxplot‐based partition method that assigns more appropriate change‐proneness labels to the training data. Eight open‐source Java projects are used in an empirical study to validate the effectiveness of the combination methods in change‐proneness prediction. The experimental results show that combining bagging with resampling significantly improves prediction performance over bagging or resampling alone. Of all the combination methods employed, bagging combined with undersampling performs better than the others, and support vector machine is a more effective base classifier than J48 or naive Bayes.
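
The combination of bagging with undersampling can be illustrated with a minimal sketch. The abstract does not specify the exact procedure, so this shows one common variant and should not be read as the authors' implementation: each bag draws the minority (change‐prone) class with replacement and an equally sized random draw from the majority class, and the ensemble predicts by majority vote. The names `fit_under_bagging`, `predict`, and `NearestCentroid` are hypothetical, and the nearest-centroid learner is only a stand-in for the base classifiers named in the abstract (J48, naive Bayes, support vector machine).

```python
import random
from collections import Counter


class NearestCentroid:
    """Stand-in base learner (assumption); the paper uses J48, naive Bayes, SVM."""

    def fit(self, X, y):
        # One centroid (per-feature mean) for each class label.
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        # Return the label whose centroid is closest in squared distance.
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


def fit_under_bagging(X, y, n_estimators=11, seed=0):
    """Train n_estimators base learners, each on a balanced bag.

    Labels: 1 = change-prone, 0 = not change-prone (assumed encoding).
    """
    rng = random.Random(seed)
    by_class = {0: [], 1: []}
    for i, label in enumerate(y):
        by_class[label].append(i)
    if not by_class[0] or not by_class[1]:
        raise ValueError("both classes must be present in the training data")
    bag_size = min(len(by_class[0]), len(by_class[1]))  # minority class size

    models = []
    for _ in range(n_estimators):
        idx = []
        for members in by_class.values():
            # Sampling with replacement: a bootstrap of the minority class and
            # an undersampled draw of the same size from the majority class.
            idx.extend(rng.choice(members) for _ in range(bag_size))
        models.append(NearestCentroid().fit([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict(models, x):
    """Majority vote over the ensemble."""
    votes = Counter(m.predict_one(x) for m in models)
    return votes.most_common(1)[0][0]
```

An odd `n_estimators` avoids tied votes in the binary case. Each feature vector would hold per-file metrics; which metrics to use is outside the scope of this sketch.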
