Premium
An Optimal Semiparametric Method for Two‐group Classification
Author(s) -
Baek Seungchul,
Komori Osamu,
Ma Yanyuan
Publication year - 2018
Publication title -
scandinavian journal of statistics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.359
H-Index - 65
eISSN - 1467-9469
pISSN - 0303-6898
DOI - 10.1111/sjos.12323
Subject(s) - mathematics , estimator , linear discriminant analysis , statistic , covariance , discriminant , group (periodic table) , parametric statistics , sufficient statistic , mathematical optimization , statistics , artificial intelligence , computer science , chemistry , organic chemistry
In the classical discriminant analysis, when two multivariate normal distributions with equal variance–covariance matrices are assumed for two groups, the classical linear discriminant function is optimal with respect to maximizing the standardized difference between the means of two groups. However, for a typical case‐control study, the distributional assumption for the case group often needs to be relaxed in practice. Komori et al. (Generalized t ‐statistic for two‐group classification. Biometrics 2015, 71: 404–416) proposed the generalized t ‐statistic to obtain a linear discriminant function, which allows for heterogeneity of case group. Their procedure has an optimality property in the class of consideration. We perform a further study of the problem and show that additional improvement is achievable. The approach we propose does not require a parametric distributional assumption on the case group. We further show that the new estimator is efficient, in that no further improvement is possible to construct the linear discriminant function more efficiently. We conduct simulation studies and real data examples to illustrate the finite sample performance and the gain that it produces in comparison with existing methods.