GEP-based classifier with drift detection for mining imbalanced data streams
Author(s) -
Joanna Jędrzejowicz,
Piotr Jędrzejowicz
Publication year - 2020
Publication title -
Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2020.08.005
Subject(s) - computer science , concept drift , data stream mining , classifier (uml) , data stream , data mining , reuse , gene expression programming , decision tree , artificial intelligence , machine learning , telecommunications , ecology , biology
Mining data streams requires coping with constraints on time and data size, as well as with possible concept drift. The task is even more challenging when the data are also imbalanced. Mining nonstationary and imbalanced data streams is a relatively new area of research. In this paper, we propose a Gene Expression Programming (GEP) classifier with drift detection and data reuse for mining imbalanced data streams. GEP is used to evolve a complex expression tree that returns predictions. The drift detector's role is to signal the occurrence of drift, which triggers the induction of a new learner. The data reuse mechanism improves the balance between minority and majority instances in the subset of data used for evolving the learner. The proposed approach is validated experimentally. The experimental results confirm that our classifier produces high-quality predictions.
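The interplay of drift detection and data reuse described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names `MinorityReusePool` and `ErrorRateDriftDetector`, the windowed error-rate rule, and the parameters `capacity`, `window`, and `tolerance` are all assumptions made for illustration, and the GEP learner itself is omitted.

```python
from collections import deque


class MinorityReusePool:
    """Hypothetical helper: stores recent instances per class for later reuse."""

    def __init__(self, capacity=100):
        self.capacity = capacity
        self.pools = {}  # label -> deque of stored feature vectors

    def add(self, chunk):
        """Remember the instances of a processed chunk of (features, label) pairs."""
        for x, y in chunk:
            self.pools.setdefault(y, deque(maxlen=self.capacity)).append(x)

    def balanced_training_set(self, chunk):
        """Return the chunk topped up with stored instances of rarer classes."""
        if not chunk:
            return []
        counts = {}
        for _, y in chunk:
            counts[y] = counts.get(y, 0) + 1
        target = max(counts.values())  # size of the largest class in the chunk
        result = list(chunk)
        for label, pool in self.pools.items():
            missing = max(target - counts.get(label, 0), 0)
            # Reuse the most recently stored instances first.
            for x in list(pool)[::-1][:missing]:
                result.append((x, label))
        return result


class ErrorRateDriftDetector:
    """Assumed rule: signal drift when the windowed error rate rises
    more than `tolerance` above the best rate seen so far."""

    def __init__(self, window=50, tolerance=0.15):
        self.window = deque(maxlen=window)
        self.tolerance = tolerance
        self.best = None

    def update(self, was_error):
        """Record one prediction outcome; return True if drift is signalled."""
        self.window.append(1 if was_error else 0)
        if len(self.window) < self.window.maxlen:
            return False  # not enough evidence yet
        rate = sum(self.window) / len(self.window)
        if self.best is None or rate < self.best:
            self.best = rate
        return rate > self.best + self.tolerance

    def reset(self):
        """Forget past outcomes, e.g. after a new learner has been induced."""
        self.window.clear()
        self.best = None
```

In a stream loop under these assumptions, each prediction outcome would be passed to `update`; when it returns `True`, a new learner would be evolved on `balanced_training_set(chunk)` and the detector reset, after which the chunk is stored with `add`.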