
K-Means Cluster Based Oversampling Algorithm for Imbalanced Data Classification
Author(s) -
Ms. S. Santha Subbulaxmi*,
G. Arumugam
Publication year - 2020
Publication title -
International Journal of Recent Technology and Engineering
Language(s) - English
Resource type - Journals
ISSN - 2277-3878
DOI - 10.35940/ijrte.e6535.018520
Subject(s) - oversampling, computer science, data mining, statistical classification, data classification, cluster (spacecraft), artificial intelligence, machine learning, field (mathematics), algorithm, pattern recognition (psychology), mathematics, computer network, bandwidth (computing), pure mathematics, programming language
Imbalanced data classification problems aim to predict a dependent variable from a skewed class distribution. Such problems arise in many application areas, such as medical disease diagnosis, risk management, and fault detection, and they remain a challenging problem in machine learning and data mining. In this paper, a K-Means cluster-based oversampling algorithm is proposed to address the imbalanced data classification problem. Experimental results show that the proposed algorithm outperforms the existing oversampling algorithms from previous studies.
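The abstract does not spell out the steps of the proposed algorithm, so the following is only a generic sketch of the idea of cluster-based oversampling, not the authors' method. It assumes a binary problem, clusters the minority class with a plain Lloyd's k-means, and creates synthetic points by interpolating between two minority samples from the same cluster until the classes are balanced. The function names `kmeans` and `kmeans_oversample`, the parameter `k`, and the interpolation rule are all illustrative assumptions.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (illustrative helper); returns (labels, centroids)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids with k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Distance from every point to every centroid, shape (n_points, k).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


def kmeans_oversample(X, y, minority_label, k=2, seed=0):
    """Balance a binary dataset with synthetic minority points.

    Assumption (not from the paper): each synthetic point is a random
    interpolation between two minority samples in the same k-means cluster.
    """
    rng = np.random.default_rng(seed)
    X_min = X[y == minority_label]
    n_needed = int((y != minority_label).sum() - len(X_min))
    if n_needed <= 0:
        return X, y  # already balanced, nothing to add
    k = min(k, len(X_min))
    labels, _ = kmeans(X_min, k, seed=seed)

    synthetic = []
    for _ in range(n_needed):
        i = rng.integers(len(X_min))                     # random minority sample
        partners = np.flatnonzero(labels == labels[i])   # same-cluster samples
        j = rng.choice(partners)
        gap = rng.random()                               # interpolation weight in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))

    X_new = np.vstack([X, np.array(synthetic)])
    y_new = np.concatenate([y, np.full(n_needed, minority_label)])
    return X_new, y_new
```

Restricting interpolation to pairs within one cluster keeps synthetic samples inside regions the minority class already occupies, which is the usual motivation for clustering before oversampling; how the paper's algorithm selects clusters or generates samples may differ.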