Maximally Selected Chi‐Square Statistics and Binary Splits of Nominal Variables
Author(s) -
Boulesteix Anne-Laure
Publication year - 2006
Publication title -
Biometrical Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.108
H-Index - 63
eISSN - 1521-4036
pISSN - 0323-3847
DOI - 10.1002/bimj.200510191
Subject(s) - mathematics, statistics, chi square test, binary number, square (algebra), statistic, pearson's chi squared test, variable (mathematics), test statistic, statistical hypothesis testing, combinatorics, arithmetic, mathematical analysis, geometry
We address the problem of maximally selected chi‐square statistics in the case of a binary Y variable and a nominal X variable with several categories. The distribution of the maximally selected chi‐square statistic has already been derived when the best cutpoint is chosen from a continuous or an ordinal X, but not when the best split is chosen from a nominal X. In this paper, we derive the exact distribution of the maximally selected chi‐square statistic in this case using a combinatorial approach. Applications of the derived distribution to variable selection and hypothesis testing are discussed based on simulations. As an illustration, our method is applied to a birth data set. (© 2006 WILEY‐VCH Verlag GmbH & Co. KGaA, Weinheim)
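To make the quantity in the abstract concrete, here is a minimal sketch of how a maximally selected chi‐square statistic could be computed for a binary Y and a nominal X. It is an illustration only and is not taken from the paper: the function names (`chi_square_2x2`, `number_of_splits`, `max_selected_chi_square`) and the input layout (one `(count of Y=0, count of Y=1)` pair per category of X) are assumptions made for this example. The sketch enumerates every binary split of the categories and keeps the largest Pearson chi‐square value; it does not compute the exact null distribution derived in the paper.

```python
from itertools import combinations


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        # A zero margin means the table carries no information about association.
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def number_of_splits(k):
    """Number of distinct binary splits of k unordered categories."""
    return 2 ** (k - 1) - 1


def max_selected_chi_square(counts):
    """Return (largest statistic, categories on the left side of the best split).

    counts[i] is a hypothetical (n_y0, n_y1) pair for category i of X.
    Category 0 is always placed on the left side, so that a split and its
    complement are not evaluated twice.
    """
    k = len(counts)
    total0 = sum(c[0] for c in counts)
    total1 = sum(c[1] for c in counts)
    best_stat, best_left = 0.0, None
    for r in range(k - 1):  # r extra categories join category 0; never all of them
        for extra in combinations(range(1, k), r):
            left = (0,) + extra
            a = sum(counts[i][0] for i in left)
            b = sum(counts[i][1] for i in left)
            stat = chi_square_2x2(a, b, total0 - a, total1 - b)
            if stat > best_stat:
                best_stat, best_left = stat, left
    return best_stat, best_left


if __name__ == "__main__":
    # Made-up counts for four categories, used purely for illustration.
    example = [(10, 0), (10, 0), (0, 10), (0, 10)]
    print(number_of_splits(len(example)))
    print(max_selected_chi_square(example))
```

Because the statistic is a maximum over 2^(K-1) - 1 candidate splits for K categories, comparing it with an ordinary chi‐square distribution with one degree of freedom would overstate significance. That selection effect is what motivates deriving the distribution of the maximum itself.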