Premium
Experimenting with prequential variations for data stream learning evaluation
Author(s) -
Hidalgo Juan I. González,
Maciel Bruno I. F.,
Barros Roberto S. M.
Publication year - 2019
Publication title -
computational intelligence
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.353
H-Index - 52
eISSN - 1467-8640
pISSN - 0824-7935
DOI - 10.1111/coin.12208
Subject(s) - computer science , sliding window protocol , concept drift , data mining , inference , data stream , process (computing) , data stream mining , machine learning , artificial intelligence , window (computing) , telecommunications , operating system
Processing data streams requires new demands not existent on static environments. In online learning, the probability distribution of the data can often change over time (concept drift). The prequential assessment methodology is commonly used to evaluate the performance of classifiers in data streams with stationary and non‐stationary distributions. It is based on the premise that the purpose of statistical inference is to make sequential probability forecasts for future observations, rather than to express information about the past accuracy achieved. This article empirically evaluates the prequential methodology considering its three common strategies used to update the prediction model, namely, Basic Window, Sliding Window, and Fading Factors. Specifically, it aims to identify which of these variations is the most accurate for the experimental evaluation of the past results in scenarios where concept drifts occur, with greater interest in the accuracy observed within the total data flow. The prequential accuracy of the three variations and the real accuracy obtained in the learning process of each dataset are the basis for this evaluation. The results of the carried‐out experiments suggest that the use of Prequential with the Sliding Window variation is the best alternative.