A Survey on Methods for Solving Data Imbalance Problem for Classification
Author(s) -
Arpit Singh,
Anuradha Purohit
Publication year - 2015
Publication title -
International Journal of Computer Applications
Language(s) - English
Resource type - Journals
ISSN - 0975-8887
DOI - 10.5120/ijca2015906677
Subject(s) - computer science, data science, information retrieval, data mining
The term "data imbalance" in classification refers to a well-established phenomenon in which a data set has an unbalanced class distribution. A data set is called unbalanced if it contains at least one class that is represented by very few examples. A range of solutions has been proposed for the data imbalance problem, including data sampling, cost evaluation of the model, bagging, boosting, and Genetic Programming (GP) based methods. This paper surveys the methods researchers have introduced to handle the data imbalance problem in order to improve classification performance, and compares the methods on the basis of their advantages and disadvantages.
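As an illustration of the data sampling family named in the abstract, the following is a minimal sketch of random oversampling, assuming a toy data set with two classes. The function names `imbalance_ratio` and `random_oversample`, the seed, and the toy data are hypothetical choices made for this example; they are not taken from the paper, which may describe sampling differently.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Ratio of the largest class count to the smallest class count."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen examples of each smaller class until
    every class has as many examples as the largest class."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # All original examples that belong to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        # Draw with replacement to fill the gap up to the target size.
        extra = [rng.choice(pool) for _ in range(target - n)]
        out_samples.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_samples, out_labels


# Toy data (hypothetical): 8 examples of class 0, 2 examples of class 1.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2
X_bal, y_bal = random_oversample(X, y)
print(imbalance_ratio(y), imbalance_ratio(y_bal))  # prints: 4.0 1.0
```

Random oversampling only duplicates existing minority examples, so it balances class counts without adding new information; this is one reason surveys of this kind weigh it against alternatives such as cost-sensitive learning and ensemble methods.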