Open Access
Vector space models for trace clustering: a comparative study
Author(s) -
Mateus Alex dos Santos Luna,
André Paulino Lima,
Thaís Rodrigues Neubauer,
Marcelo Fantinato,
Sarajane Marques Peres
Publication year - 2021
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5753/eniac.2021.18274
Subject(s) - cluster analysis, trace (psycholinguistics), computer science, data mining, event (particle physics), process mining, hierarchical clustering, process (computing), correlation clustering, artificial intelligence, business process, machine learning, business process management, work in process, engineering, philosophy, linguistics, physics, operations management, quantum mechanics, operating system
Process mining explores event logs to offer valuable insights to business process managers. Some types of business processes are hard to mine, including unstructured and knowledge-intensive processes. For such processes, trace clustering is usually applied to break an event log into sublogs, each of which is more amenable to typical process mining tasks. However, applying clustering algorithms involves decisions, such as how traces are represented, that affect the quality of the results. In this paper, we compare four vector space models for trace clustering, using them with an agglomerative clustering algorithm on synthetic and real-world event logs. Our analyses suggest that the embeddings-based vector space model can properly handle trace clustering in unstructured processes.
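The pipeline described in the abstract (represent each trace as a vector, then cluster the vectors agglomeratively to obtain sublogs) can be illustrated with a minimal sketch. The abstract does not name the four vector space models compared, so the representation used here, a simple count-based bag of activities, is an assumption chosen for illustration; the cosine distance, the average linkage, the toy traces, and all function names are likewise hypothetical and not taken from the paper.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def bag_of_activities(traces):
    """Map each trace (a list of activity labels) to a count vector.

    Assumption: a count-based bag of activities stands in for the
    vector space models compared in the paper.
    """
    vocab = sorted({activity for trace in traces for activity in trace})
    vectors = []
    for trace in traces:
        counts = Counter(trace)
        vectors.append([counts[activity] for activity in vocab])
    return vocab, vectors


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def agglomerative(vectors, n_clusters):
    """Average-linkage agglomerative clustering over trace vectors.

    Starts with one cluster per trace and repeatedly merges the two
    clusters with the smallest mean pairwise distance until only
    n_clusters remain. Returns clusters as sorted lists of trace indices.
    """
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > n_clusters:
        best = None
        for a, b in combinations(range(len(clusters)), 2):
            total = sum(
                cosine_distance(vectors[i], vectors[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            mean = total / (len(clusters[a]) * len(clusters[b]))
            if best is None or mean < best[0]:
                best = (mean, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # safe because a < b
    return [sorted(cluster) for cluster in clusters]


# Hypothetical toy event log: each trace is a sequence of activities.
traces = [
    ["register", "check", "approve"],
    ["register", "check", "check", "approve"],
    ["register", "reject"],
    ["register", "review", "reject"],
]

vocab, vectors = bag_of_activities(traces)
sublogs = agglomerative(vectors, n_clusters=2)
print(sublogs)
```

In this toy log the two approval traces and the two rejection traces end up in separate sublogs. A count-based representation ignores activity order, which is one reason the choice of vector space model matters for less structured processes.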