A tree‐based model for homogeneous groupings of multinomials
Author(s) - Yang Tae Young
Publication year - 2005
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.2182
Subject(s) - tree (set theory) , homogeneous , pruning , sequence (biology) , computer science , multinomial distribution , binary tree , rank (graph theory) , bayesian probability , heuristic , mathematics , wilcoxon signed rank test , k ary tree , tree structure , statistics , artificial intelligence , algorithm , combinatorics , genetics , agronomy , biology , mann–whitney u test
The motivation of this paper is to provide a tree‐based method for grouping multinomial data according to their classification probability vectors. We produce an initial tree by binary recursive partitioning whereby multinomials are successively split into two subsets and the splits are determined by maximizing the likelihood function. If the number of multinomials k is too large, we propose to order the multinomials, and then build the initial tree based on a dramatically smaller number k − 1 of possible splits. The tree is then pruned from the bottom up. The pruning process involves a sequence of hypothesis tests of a single homogeneous group against the alternative that there are two distinct, internally homogeneous groups. As pruning criteria, the Bayesian information criterion and the Wilcoxon rank‐sum test are proposed. The tree‐based model is illustrated on genetic sequence data. Homogeneous groupings of genetic sequences present new opportunities to understand and align these sequences. Copyright © 2005 John Wiley & Sons, Ltd.
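
As a rough illustration of the split-and-test idea in the abstract, the following Python sketch scans the k − 1 split points of an already ordered list of multinomial count vectors, picks the split that maximizes the pooled multinomial likelihood, and compares one group against two groups with a standard BIC. It is not the paper's implementation: the function names, the assumption that the rows arrive pre-ordered (the abstract does not say how the ordering is chosen), and the particular BIC form (m − 1 free parameters per group, total count as sample size) are all assumptions made for illustration.

```python
import math


def pooled_loglik(rows):
    """Maximized multinomial log-likelihood of rows sharing one probability
    vector. The multinomial coefficients are dropped: they depend only on
    the data, so they cancel when two groupings are compared."""
    totals = [sum(col) for col in zip(*rows)]
    n = sum(totals)
    return sum(c * math.log(c / n) for c in totals if c > 0)


def best_ordered_split(rows):
    """Scan the k - 1 split points of an ordered list of k count vectors.
    Returns (split index, log-likelihood of the two-group model)."""
    best = None
    for i in range(1, len(rows)):
        ll = pooled_loglik(rows[:i]) + pooled_loglik(rows[i:])
        if best is None or ll > best[1]:
            best = (i, ll)
    return best


def bic_prefers_split(rows):
    """Compare one homogeneous group with the best two-group split using
    BIC = -2 * loglik + (free parameters) * log(total count).
    This BIC form is an assumed, textbook choice."""
    m = len(rows[0])                       # number of categories
    n = sum(sum(r) for r in rows)          # total number of observations
    i, ll_two = best_ordered_split(rows)
    ll_one = pooled_loglik(rows)
    bic_one = -2 * ll_one + (m - 1) * math.log(n)
    bic_two = -2 * ll_two + 2 * (m - 1) * math.log(n)
    return i, bic_two < bic_one


# Hypothetical data: four multinomials over three categories.
counts = [[30, 5, 5], [28, 6, 6], [5, 30, 5], [6, 29, 5]]
split_index, keep_split = bic_prefers_split(counts)
```

In a full tree, the same comparison would be applied recursively to grow the initial tree and then from the bottom up to decide which splits to prune; the Wilcoxon rank‐sum criterion mentioned in the abstract is not sketched here.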
