Open Access
On Evaluation of Data Stream Clustering Algorithms: A Survey
Author(s) -
Christian Nordahl,
Veselka Boeva,
Hakan Grahn,
Marie Persson-Netz
Publication year - 2025
Publication title -
IEEE Access
Language(s) - English
Resource type - Magazines
SCImago Journal Rank - 0.587
H-Index - 127
eISSN - 2169-3536
DOI - 10.1109/access.2025.3596435
Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation
Data stream mining is a research area that has grown enormously in recent years. The main challenge is to extract knowledge in real time from a possibly unbounded stream of data. Clustering data streams, a process in which groupings within the data are identified, is a useful technique for extracting and identifying the underlying structures of the data. An open question in the field of stream clustering is how to evaluate the proposed algorithms. In this survey, we review the literature in the domain to identify the common methodologies, datasets, and evaluation measures used to evaluate the algorithms. We provide a short summary of the stream clustering algorithms in the literature, but our primary focus lies in the survey of cluster validation relevant to the evaluation of data stream clustering algorithms. We begin our literature review with the inception of incremental clustering, namely the introduction of the balanced iterative reducing and clustering using hierarchies (BIRCH) algorithm. We find that evaluation methodologies primarily focus on performance and that aspects such as cluster quality are rarely examined. Performance, in terms of both computational performance and accuracy, has been the focal point of all evaluation since the inception of data stream clustering. We also find that issues that exist in the conventional clustering domain are present in data stream clustering as well. However, minor additions to the evaluation methods can improve both the applicability and usefulness of the algorithms.
