
K Means Cluster Based Undersampling Ensemble for Imbalanced Data Classification
Author(s) -
S. Santha Subbulaxmi,
G. Arumugam
Publication year - 2020
Publication title -
International Journal of Engineering and Advanced Technology
Language(s) - English
Resource type - Journals
ISSN - 2249-8958
DOI - 10.35940/ijeat.c5188.029320
Subject(s) - undersampling, boosting (machine learning), computer science, statistical classification, data mining, data classification, ensemble learning, artificial intelligence, machine learning, cluster (spacecraft), cluster analysis, oversampling, pattern recognition (psychology), programming language, computer network, bandwidth (computing)
Imbalanced data classification is a critical and challenging problem in both data mining and machine learning. It arises in many application areas, such as rare medical diagnosis, risk management, and fault detection. Traditional classification algorithms yield poor results on imbalanced classification problems. In this paper, a K-Means cluster-based undersampling ensemble algorithm is proposed to solve the imbalanced data classification problem. The proposed method combines K-Means cluster-based undersampling with boosting. The experimental results show that the proposed algorithm outperforms the sampling-based ensemble algorithms of previous studies.
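
The abstract does not spell out the algorithm's steps, so the following is only an illustrative sketch of the general idea of K-Means cluster-based undersampling, not the authors' implementation. It assumes one common variant: the majority class is grouped into as many clusters as there are minority samples, and the majority sample nearest each centroid is kept. The function names `kmeans` and `cluster_undersample`, the synthetic data, and all parameter values are hypothetical, and the boosting stage is left out.

```python
import random
from math import dist


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm on tuples; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every point.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members)
                                     for x in zip(*members))
    return centroids, labels


def cluster_undersample(majority, n_keep, seed=0):
    """Keep one representative majority sample per K-Means cluster.

    Assumption (not taken from the paper): the representative is the
    cluster member closest to the cluster centroid.
    """
    centroids, labels = kmeans(majority, n_keep, seed=seed)
    kept = []
    for c, centre in enumerate(centroids):
        members = [p for p, lab in zip(majority, labels) if lab == c]
        if members:
            kept.append(min(members, key=lambda p: dist(p, centre)))
    return kept


# Synthetic, made-up data: 200 majority points and 20 minority points.
rng = random.Random(1)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(200)]
minority = [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(20)]

reduced = cluster_undersample(majority, n_keep=len(minority))
# Balanced training set: label 0 = majority, label 1 = minority.
balanced = [(p, 0) for p in reduced] + [(p, 1) for p in minority]
print(len(majority), "majority samples reduced to", len(reduced))
```

In an ensemble setting, a balanced set like `balanced` would then be passed to a boosting learner (for example AdaBoost), possibly repeating the undersampling with different seeds to obtain several base models. How the paper couples the two stages is not stated in the abstract.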