Minority oversampling based on the attraction‐repulsion Weber problem
Author(s) - Fiore Ugo
Publication year - 2020
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.5601
Subject(s) - oversampling, generalization, preprocessor, class (philosophy), computer science, artificial intelligence, attraction, machine learning, synthetic data, power (physics), pattern recognition (psychology), mathematics, bandwidth (computing), linguistics, philosophy, physics, mathematical analysis, computer network, quantum mechanics
Summary - Learning on imbalanced datasets, where one class is underrepresented, is both difficult and important. On the one hand, a limited number of positive examples restricts the generalization ability of classifiers. On the other hand, the class of interest is often of interest precisely because it is rare. The Synthetic Minority Oversampling TEchnique (SMOTE) is a preprocessing method that creates new synthetic examples by interpolating between neighboring instances. In this work, an enhancement to SMOTE is proposed, which characterizes synthetic instances as solutions of attraction‐repulsion problems among the neighboring data points. Experimental evaluation shows an improvement in the positive predictive power of classification.
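
The interpolation step that the summary attributes to baseline SMOTE can be illustrated with a minimal sketch. It assumes Euclidean distance and NumPy; the function name `smote_sample` and the parameters `k`, `n_new`, and `seed` are hypothetical and chosen only for illustration. It does not implement the attraction‐repulsion enhancement proposed in the article, whose details are not given in this record.

```python
import numpy as np

def smote_sample(X_min, k=5, n_new=10, seed=0):
    """Illustrative SMOTE-style oversampling (not the article's method).

    Each synthetic point lies on the segment between a randomly chosen
    minority instance and one of its k nearest minority neighbors.
    """
    rng = np.random.default_rng(seed)
    X_min = np.asarray(X_min, dtype=float)
    n = len(X_min)
    if n < 2:
        raise ValueError("need at least two minority instances")
    k = min(k, n - 1)  # cannot have more neighbors than other points

    # Pairwise Euclidean distances among minority instances.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.linalg.norm(diff, axis=2)
    np.fill_diagonal(dist, np.inf)  # a point is not its own neighbor

    # Indices of the k nearest neighbors of every minority instance.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Pick a base instance and one of its neighbors for each new point.
    base = rng.integers(0, n, size=n_new)
    chosen = neighbors[base, rng.integers(0, k, size=n_new)]

    # Interpolate with a random fraction in [0, 1) along the segment.
    gap = rng.random((n_new, 1))
    return X_min[base] + gap * (X_min[chosen] - X_min[base])


# Usage: oversample four minority points in the plane.
X_min = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_sample(X_min, k=2, n_new=6)
print(synthetic.shape)
```

Because every synthetic point is a convex combination of two existing minority instances, the new points never leave the region spanned by the minority class. A production pipeline would more likely rely on an established library implementation than on this sketch.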