Open Access
Algorithms and methods of data clustering in the analysis of information security event logs
Author(s) - Dia. Sidorova, Evgeniy N. Pivkin
Publication year - 2022
Publication title - bezopasnostʹ cifrovyh tehnologij
Language(s) - English
Resource type - Journals
ISSN - 2782-2230
DOI - 10.17212/2782-2230-2022-1-41-60
Subject(s) - cluster analysis, computer science, data mining, overfitting, event (particle physics), partition (number theory), algorithm, mathematics, machine learning, artificial neural network, physics, quantum mechanics, combinatorics
Security event logs describe the state of an information system and make it possible to detect anomalies in user behavior and cybersecurity incidents. The article reviews the existing event logs (application, system, and security event logs) and their division into types. Automated analysis of security event log data is difficult, however, because the logs contain a large amount of unstructured data collected from various sources. The article therefore states and describes the problem of analyzing information security event logs. To address this problem, it considers new and relatively little-studied methods and algorithms for data clustering: Random forest, incremental clustering, and the IPLoM (Iterative Partitioning Log Mining) algorithm. The Random forest algorithm builds decision trees on samples of the data, obtains a prediction from each tree, and selects the final result by voting. Averaging over the trees reduces overfitting. The algorithm is also used for regression and classification problems. Incremental clustering defines clusters as groups of objects that belong to the same class or concept, which is a specific set of pairs. The clusters it defines can overlap, which allows a degree of "fuzziness" for samples that lie at the boundaries of different clusters. The IPLoM algorithm uses the unique characteristics of log messages to partition the log iteratively, which helps to extract message types efficiently.
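
The iterative partitioning idea behind IPLoM can be illustrated with a short Python sketch. It is a simplified illustration rather than the implementation discussed in the article: it covers only partitioning by token count, partitioning by the token position with the fewest distinct values, and template extraction, and it omits the bijection-search step and the thresholds of the full algorithm. All function names and the sample log lines are hypothetical and invented for this example.

```python
from collections import defaultdict


def partition_by_token_count(messages):
    """Step 1: group messages that have the same number of tokens."""
    groups = defaultdict(list)
    for msg in messages:
        tokens = msg.split()
        groups[len(tokens)].append(tokens)
    return list(groups.values())


def partition_by_token_position(partition):
    """Step 2: split on the token position with the fewest distinct values."""
    width = len(partition[0])
    if width == 0:
        return [partition]
    # Count the distinct tokens seen at each position.
    cardinality = [len({tokens[i] for tokens in partition}) for i in range(width)]
    split_pos = cardinality.index(min(cardinality))
    groups = defaultdict(list)
    for tokens in partition:
        groups[tokens[split_pos]].append(tokens)
    return list(groups.values())


def extract_template(partition):
    """Final step: keep constant tokens, replace variable ones with '*'."""
    template = []
    for column in zip(*partition):
        template.append(column[0] if len(set(column)) == 1 else "*")
    return " ".join(template)


def mine_message_types(messages):
    """Run the simplified pipeline and return the sorted message templates."""
    templates = []
    for by_count in partition_by_token_count(messages):
        for by_position in partition_by_token_position(by_count):
            templates.append(extract_template(by_position))
    return sorted(templates)


# Hypothetical log lines, invented for illustration only.
SAMPLE_LOG = [
    "Failed password for alice",
    "Failed password for bob",
    "Connection closed by 10.0.0.5",
    "Connection closed by 10.0.0.7",
    "Session opened for user carol",
    "Session opened for user dave",
]

for template in mine_message_types(SAMPLE_LOG):
    print(template)
```

On this sample the sketch yields three message types, with the variable field (user name or address) replaced by a wildcard. Splitting on the position with the fewest distinct tokens is a heuristic: such positions are more likely to hold fixed keywords than variable values.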
