Open Access
Divergence decision tree classification with Kolmogorov kernel smoothing in high energy physics
Author(s) - Václav Kůs, Kristina Jaruskova
Publication year - 2021
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1730/1/012060
Subject(s) - smoothing, divergence (linguistics), metric (unit), kernel density estimation, artificial intelligence, decision tree, computer science, kernel (algebra), binary classification, pattern recognition (psychology), machine learning, mathematics, data mining, statistics, support vector machine, discrete mathematics, philosophy, linguistics, operations management, estimator, economics
The binary classification of a given dataset is the task of assigning one of two possible classes to each observation. This can be achieved by many machine learning techniques, e.g. logistic regression, decision trees, or neural networks. The supervised divergence decision tree (SDDT) is our own binary classification algorithm based on the Rényi divergence, which incorporates multi-dimensional kernel density estimates (KDEs) as the main part of the splitting process in its tree nodes. However, the KDE needs efficient smoothing, i.e. a well-chosen bandwidth, to yield satisfactory classification results. In this paper, the D-discrepancy method for selecting the bandwidth is applied. It is based on an evaluation of divergences, or distances, between two estimated distributions. The Kolmogorov metric on the probability space is used, and the performance of this novel technique is compared to standard smoothing techniques. The final goal is to perform a binary classification and achieve the best possible AUC value (area under the ROC curve) on a given high energy physics (HEP) dataset, specifically d+Au heavy-ion decay data. The HEP dataset is described and the main structure of the SDDT used is outlined. Final classification results are presented for the KDE with Kolmogorov D-method smoothing in the SDDT algorithm.
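
To make the idea of distance-based bandwidth selection concrete, the following is a minimal one-dimensional sketch in Python. It is not the authors' SDDT or their exact D-discrepancy rule, neither of which is specified in the abstract. It assumes a Gaussian kernel, measures the Kolmogorov distance between the empirical CDF and the CDF implied by the KDE, and picks the largest bandwidth on a grid whose distance stays below a tolerance. The selection rule, the tolerance 1/sqrt(n), the synthetic Gaussian data, and all function names are illustrative assumptions.

```python
import math
import random


def kde_pdf(x, sample, h):
    """Gaussian kernel density estimate at x with bandwidth h."""
    c = 1.0 / (len(sample) * h * math.sqrt(2.0 * math.pi))
    return c * sum(math.exp(-0.5 * ((x - xi) / h) ** 2) for xi in sample)


def kde_cdf(x, sample, h):
    """CDF of the Gaussian KDE at x (average of shifted normal CDFs)."""
    s = h * math.sqrt(2.0)
    return sum(0.5 * (1.0 + math.erf((x - xi) / s)) for xi in sample) / len(sample)


def kolmogorov_distance(sample, h):
    """sup_x |F_n(x) - F_h(x)| between the empirical CDF and the KDE CDF.

    F_h is continuous and nondecreasing while F_n is a step function,
    so it suffices to check both sides of each jump of F_n.
    """
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = kde_cdf(x, xs, h)
        d = max(d, abs(f - i / n), abs(f - (i + 1) / n))
    return d


def select_bandwidth(sample, grid, tol):
    """ASSUMED rule: largest grid bandwidth with distance <= tol.

    Falls back to the smallest grid value if none qualifies.
    """
    ok = [h for h in grid if kolmogorov_distance(sample, h) <= tol]
    return max(ok) if ok else min(grid)


def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly;
    ties count one half."""
    wins = sum((p > q) + 0.5 * (p == q) for p in scores_pos for q in scores_neg)
    return wins / (len(scores_pos) * len(scores_neg))


# Synthetic stand-in data (NOT the d+Au dataset): two shifted Gaussians.
random.seed(0)
n = 150
signal = [random.gauss(1.0, 1.0) for _ in range(n)]
background = [random.gauss(-1.0, 1.0) for _ in range(n)]
test_signal = [random.gauss(1.0, 1.0) for _ in range(n)]
test_background = [random.gauss(-1.0, 1.0) for _ in range(n)]

grid = [0.05 * k for k in range(1, 21)]
tol = 1.0 / math.sqrt(n)  # hypothetical tolerance, chosen for illustration
h_sig = select_bandwidth(signal, grid, tol)
h_bkg = select_bandwidth(background, grid, tol)


def score(x):
    """Estimated signal fraction at x from the two class-wise KDEs."""
    fs = kde_pdf(x, signal, h_sig)
    fb = kde_pdf(x, background, h_bkg)
    return fs / (fs + fb + 1e-300)  # tiny constant guards against 0/0


auc_value = auc([score(x) for x in test_signal],
                [score(x) for x in test_background])
print("bandwidths:", h_sig, h_bkg, "AUC:", auc_value)
```

In this sketch the bandwidth is constrained rather than minimised, because the Kolmogorov distance to the empirical CDF shrinks as the bandwidth goes to zero, so a plain minimisation would select almost no smoothing. A multi-dimensional version inside tree nodes, as the abstract describes, would need a multivariate kernel and a different distance computation.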
