
Classification Algorithm in Data Mining Based on Maximum Exponential Class Counts Technique
Author(s) -
D. Mabuni*
Publication year - 2020
Publication title -
International Journal of Innovative Technology and Exploring Engineering
Language(s) - English
Resource type - Journals
ISSN - 2278-3075
DOI - 10.35940/ijitee.h6274.069820
Subject(s) - categorical variable, decision tree, measure (data warehouse), node (physics), tree (set theory), information gain ratio, class (philosophy), data mining, computer science, mathematics, id3 algorithm, decision tree learning, exponential function, incremental decision tree, artificial intelligence, information gain, algorithm, pattern recognition (psychology), statistics, machine learning, combinatorics, mathematical analysis, structural engineering, engineering
A new split attribute measure for node splitting during decision tree construction is proposed. The measure consists of the sum of class counts of the distinct values of categorical attributes in the dataset. Larger counts induce larger partitions and smaller trees, and thereby help determine the best split attribute. The new measure is termed maximum exponential class counts (MECC). Experimental results obtained over several UCI machine learning categorical datasets predominantly indicate that decision tree models built with the proposed MECC node split technique provide better classification accuracy and smaller trees than decision trees built using the popular gain ratio, normalized gain ratio, and Gini index measures. The experiments focus mainly on analyzing the effect of the node splitting measures alone.
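
The abstract does not give the MECC formula, so the following Python sketch is only an illustration of the general idea of a count-based split score, not the author's method. It tallies class counts for each distinct value of a categorical attribute and sums the largest class count per value. The exponential weighting implied by the name MECC is not specified in the abstract and is deliberately omitted. The function names, the toy dataset, and the scoring rule are all assumptions made for illustration. A weighted Gini index, one of the baseline measures named above, is included for comparison.

```python
from collections import Counter, defaultdict


def class_counts_by_value(rows, attr, target):
    """Map each distinct value of `attr` to a Counter of class labels."""
    counts = defaultdict(Counter)
    for row in rows:
        counts[row[attr]][row[target]] += 1
    return counts


def summed_max_class_count(rows, attr, target):
    """Hypothetical count-based score (NOT the published MECC formula):
    sum, over the distinct values of `attr`, of the largest class count."""
    counts = class_counts_by_value(rows, attr, target)
    return sum(max(c.values()) for c in counts.values())


def gini_index(rows, attr, target):
    """Weighted Gini impurity of the partition induced by `attr`
    (lower means purer partitions)."""
    n = len(rows)
    total = 0.0
    for c in class_counts_by_value(rows, attr, target).values():
        size = sum(c.values())
        impurity = 1.0 - sum((k / size) ** 2 for k in c.values())
        total += (size / n) * impurity
    return total


def best_attribute(rows, attrs, target):
    """Pick the attribute with the highest count-based score."""
    return max(attrs, key=lambda a: summed_max_class_count(rows, a, target))


# Toy categorical dataset, invented for this example.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "no"},
    {"outlook": "sunny", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "no", "play": "yes"},
    {"outlook": "rain", "windy": "yes", "play": "no"},
    {"outlook": "overcast", "windy": "yes", "play": "yes"},
]

if __name__ == "__main__":
    for attr in ("outlook", "windy"):
        print(attr,
              summed_max_class_count(rows, attr, "play"),
              round(gini_index(rows, attr, "play"), 3))
    print("chosen split:", best_attribute(rows, ["outlook", "windy"], "play"))
```

On this toy data both the count-based score and the Gini index prefer the same attribute, but that agreement is a property of the example and says nothing about the paper's reported results.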