Open Access
Improved students’ performance prediction for multi-class imbalanced problems using hybrid and ensemble approach in educational data mining
Author(s) -
Hasniza Hassan,
Nor Bahiah Ahmad,
Syahid Anuar
Publication year - 2020
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1529/5/052041
Subject(s) - undersampling, oversampling, computer science, machine learning, artificial intelligence, adaboost, benchmark (surveying), ensemble learning, class (philosophy), data mining, sampling (signal processing), binary classification, support vector machine, filter (signal processing), computer network, geodesy, bandwidth (computing), computer vision, geography
Class imbalance is a well-known and recurring problem in data mining. Researchers have studied it in several fields using three common families of techniques: sampling, ensemble learning, and cost-sensitive learning. However, such studies are still new in the education domain. The problem is closely tied to data quality, which has the greatest impact on the accuracy of prediction results. Most previous studies focus on binary imbalanced classification rather than the multi-class imbalance problem in education data. This study used 4413 student instances from two datasets, a student information system and an e-learning system, from the Faculty of Engineering at a Malaysian university for the first semester of 2017/2018. Three categories of sampling were used: oversampling, undersampling, and hybrid techniques. The research empirically analyzes five types of ensemble classifiers and seven sampling techniques. The experimental results show that the hybrid technique of ROS with AdaBoost produces the best performance compared with the other benchmark techniques. The SMOTEENN technique with ensemble classifiers also consistently produces high results. This technique has great potential for improving students’ performance prediction models.
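
To make the oversampling idea in the abstract concrete, here is a minimal sketch of random oversampling for a multi-class label, written with the Python standard library only. It is not the authors' implementation. The function name `random_oversample`, the toy labels, and the choice to balance every class up to the size of the largest class are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class until every
    class has as many rows as the largest class (illustrative sketch)."""
    rng = random.Random(seed)

    # Group the feature rows by their class label.
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    # Every class is raised to the size of the largest class.
    target = max(len(rows) for rows in by_class.values())

    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Sample with replacement to fill the gap for this class.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


# Hypothetical three-class toy data, not the dataset used in the study.
X = [[i] for i in range(10)]
y = ["pass"] * 6 + ["fail"] * 3 + ["distinction"] * 1

X_res, y_res = random_oversample(X, y)
print(Counter(y_res))  # each of the three classes now has 6 rows
```

In practice the resampling would normally be applied to the training split only, so that duplicated rows do not leak into the evaluation data, and the balanced training set would then be passed to an ensemble classifier such as AdaBoost.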
