z-logo
Premium
Avoiding the Look‐Ahead Pathology of Decision Tree Learning
Author(s) -
Last Mark,
Roizman Michael
Publication year - 2013
Publication title -
international journal of intelligent systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.291
H-Index - 87
eISSN - 1098-111X
pISSN - 0884-8173
DOI - 10.1002/int.21612
Subject(s) - decision tree , computer science , tree (set theory) , artificial intelligence , machine learning , mathematics , mathematical analysis
Most decision‐tree induction algorithms are using a local greedy strategy, where a leaf is always split on the best attribute according to a given attribute‐selection criterion. A more accurate model could possibly be found by looking ahead for alternative subtrees. However, some researchers argue that the look‐ahead should not be used due to a negative effect (called “decision‐tree pathology”) on the decision‐tree accuracy. This paper presents a new look‐ahead heuristics for decision‐tree induction. The proposed method is called look‐ahead J48 ( LA‐J48) as it is based on J48, the Weka implementation of the popular C4.5 algorithm. At each tree node, the LA‐J48 algorithm applies the look‐ahead procedure of bounded depth only to attributes that are not statistically distinguishable from the best attribute chosen by the greedy approach of C4.5. A bootstrap process is used for estimating the standard deviation of splitting criteria with unknown probability distribution. Based on a separate validation set, the attribute producing the most accurate subtree is chosen for the next step of the algorithm. In experiments on 20 benchmark data sets, the proposed look‐ahead method outperforms the greedy J48 algorithm with the gain ratio and the gini index splitting criteria, thus avoiding the look‐ahead pathology of decision‐tree induction.

This content is not available in your region!

Continue researching here.

Having issues? You can contact us here