
Estimation of general complexity of the procedure for constructing a binary logical classification tree for an arbitrary case
Author(s) -
Igor Povkhan
Publication year - 2020
Publication title -
telekomunìkacìjnì ta ìnformacìjnì tehnologìï
Language(s) - English
Resource type - Journals
ISSN - 2412-4338
DOI - 10.31673/2412-4338.2020.021100
Subject(s) - binary tree , tree (set theory) , set (abstract data type) , binary classification , binary number , computer science , feature (linguistics) , mathematics , logical conjunction , range (aeronautics) , artificial intelligence , tree structure , binary decision diagram , sample (material) , pattern recognition (psychology) , data mining , algorithm , theoretical computer science , support vector machine , mathematical analysis , linguistics , philosophy , materials science , arithmetic , composite material , programming language , chemistry , chromatography
We propose an upper estimate of the complexity of the binary logical tree synthesis procedure for classifying an arbitrary case (for conditions of weak and strong separation of classes in the training sample). The solution to this question is of a fundamental nature, regarding the assessment of the structural complexity of classification models (in the form of tree structures) of discrete objects for a wide range of applied classification and recognition problems in terms of developing promising schemes and methods for their final optimization (minimization) of the structure. This research is relevant not only for the constructions of logical classification trees, but also allows us to extend the complexity estimation scheme itself to the general case of algorithmic structures of classification trees (concepts of algorithm trees and generalized feature trees). The current issue of complexity of the general procedure for constructing a logical classification tree based on the concept of step-by-step selection of sets of elementary features (their possible heterogeneous sets and combinations), which for a given initial training sample (an array of discrete information) builds a tree structure (classification model), from a set of elementary features (basic attributes) evaluated at each stage of the model construction scheme for this sample. Thus, modern information technologies based on mathematical models of pattern recognition (logical and algorithmic classification trees) are widely used in socio-economic, environmental and other systems of primary analysis and processing of large amounts of information. This is due to the fact that this approach allows you to eliminate a set of existing disadvantages of well-known classical methods and schemes and achieve a fundamentally new result. The work is devoted to the problems of classification tree models (decision trees), and offers an assessment of the complexity of logical tree structures (classification tree models), which consist of selected and ranked sets of elementary features built on the basis of the General concept of branched feature selection. This method, when forming the current vertex of the logical tree (node), provides the selection of the most informative (qualitative) elementary features from the source set. This approach allows you to significantly reduce the size and complexity of the tree (the total number of branches and tiers of the structure) and improve the quality of its subsequent analysis.