PERFORMANCE OF A LOCALIZED TREE SPLITTING CRITERION IN TREE AVERAGING
Author(s) - Bremner Alexandra P., Taplin Ross H.
Publication year - 2004
Publication title - Australian and New Zealand Journal of Statistics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.434
H-Index - 41
eISSN - 1467-842X
pISSN - 1369-1473
DOI - 10.1111/j.1467-842x.2004.00355.x
Subject(s) - mathematics, categorical variable, tree (set theory), bayesian information criterion, statistics, combinatorics
Summary - This paper explores the performance of the local splitting criterion devised by Bremner & Taplin for classification and regression trees when multiple trees are averaged to improve performance. The criterion is compared with the deviance used in Clark & Pregibon's method, a global splitting criterion typically used to grow trees. The paper considers multiple trees generated in two ways: by randomly selecting splits with probability proportional to the likelihood for the split, and by bagging, where bootstrap samples from the data are used to grow trees. Across six datasets, the superiority of the localized splitting criterion often persists when multiple trees are grown and averaged. Tree averaging is known to be advantageous when the trees being averaged produce different predictions, and this can be achieved by choosing splits where the splitting criterion is locally optimal. The paper shows that the use of locally optimal splits gives promising results in conjunction with both local and global splitting criteria, and with and without random selection of splits. The paper also extends the local splitting criterion to accommodate categorical predictors.
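The bagging and averaging steps mentioned in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: it grows one-split regression trees ("stumps") using the residual sum of squares (the Gaussian deviance) as a global splitting criterion, fits each stump to a bootstrap sample, and averages the predictions. The localized criterion of Bremner & Taplin is not defined in this record, so it is not reproduced here. All function names (`sse`, `fit_stump`, `bagged_stumps`, `predict_average`) and the default of 25 trees are assumptions made for illustration.

```python
import random


def sse(ys):
    """Residual sum of squares around the mean (Gaussian deviance)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)


def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimising the total deviance.

    Returns (threshold, left_mean, right_mean). This is a global criterion:
    it scores a split by the fit over all observations in the node.
    """
    best = None
    # The largest x is skipped because it would leave the right side empty.
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = sse(left) + sse(right)
        if best is None or score < best[0]:
            best = (score, t, sum(left) / len(left), sum(right) / len(right))
    if best is None:
        # All x values identical: no split is possible, so predict the mean.
        m = sum(ys) / len(ys)
        return (float("inf"), m, m)
    _, t, left_mean, right_mean = best
    return (t, left_mean, right_mean)


def predict_stump(stump, x):
    t, left_mean, right_mean = stump
    return left_mean if x <= t else right_mean


def bagged_stumps(xs, ys, n_trees=25, seed=0):
    """Grow one stump per bootstrap sample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return stumps


def predict_average(stumps, x):
    """Average the predictions of all trees in the ensemble."""
    return sum(predict_stump(s, x) for s in stumps) / len(stumps)


# Toy data: a step function in x.
xs = list(range(10))
ys = [0.0] * 5 + [10.0] * 5
single = fit_stump(xs, ys)
ensemble = bagged_stumps(xs, ys)
```

Because each bootstrap sample can place the split at a different threshold, the stumps disagree near the step, which is the situation in which averaging is expected to help. Replacing `fit_stump` with a routine that samples a split at random, with probability proportional to its likelihood, would give the other ensemble scheme the summary describes.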