Test of Marginal Compatibility and Smoothing Methods for Exchangeable Binary Data with Unequal Cluster Sizes
Author(s) - Pang Zhen, Kuk Anthony Y. C.
Publication year - 2007
Publication title - Biometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.298
H-Index - 130
eISSN - 1541-0420
pISSN - 0006-341X
DOI - 10.1111/j.1541-0420.2006.00678.x
Subject(s) - mathematics, statistics, smoothing, binary data, covariate, nonparametric statistics, binary number, computer science, arithmetic
Summary - Exchangeable binary data are often collected in developmental toxicity and other studies, and a whole host of parametric distributions for fitting this kind of data have been proposed in the literature. While these distributions can be matched to have the same marginal probability and intra‐cluster correlation, they can be quite different in terms of shape and higher‐order quantities of interest such as the litter‐level risk of having at least one malformed fetus. A sensible alternative is to fit a saturated model (Bowman and George, 1995, Journal of the American Statistical Association 90, 871–879) using the expectation‐maximization (EM) algorithm proposed by Stefanescu and Turnbull (2003, Biometrics 59, 18–24). The assumption of compatibility of marginal distributions is often made to link up the distributions for different cluster sizes so that estimation can be based on the combined data. Stefanescu and Turnbull proposed a modified trend test to test this assumption. Their test, however, fails to take into account the variability of an estimated null expectation and as a result leads to inaccurate p‐values. This drawback is rectified in this article. When the data are sparse, the probability function estimated using a saturated model can be very jagged and some kind of smoothing is needed. We extend the penalized likelihood method (Simonoff, 1983, Annals of Statistics 11, 208–218) to the present case of unequal cluster sizes and implement the method using an EM‐type algorithm. In the presence of a covariate, we propose a penalized kernel method that performs smoothing in both the covariate and response space. The proposed methods are illustrated using several data sets, and the sampling and robustness properties of the resulting estimators are evaluated by simulations.
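The claim that distributions matched on marginal probability and intra‐cluster correlation can still disagree on litter‐level risk can be illustrated with a minimal sketch. It is not taken from the article: the two models (a beta‐binomial and a simple "shared response" mixture), the function names, and the values of p and rho are all assumptions chosen for illustration.

```python
from math import prod


def risk_beta_binomial(p, rho, n):
    """P(at least one response among n) under a beta-binomial model.

    Assumed parameterization: mixing distribution Beta(a, b) with
    mean p = a / (a + b) and correlation rho = 1 / (a + b + 1),
    for 0 < p < 1 and 0 < rho < 1.
    """
    a = p * (1 - rho) / rho
    b = (1 - p) * (1 - rho) / rho
    # P(no response) = B(a, b + n) / B(a, b), written as a product.
    p_none = prod((b + k) / (a + b + k) for k in range(n))
    return 1 - p_none


def risk_shared_mixture(p, rho, n):
    """P(at least one response among n) under an illustrative mixture.

    With probability rho all n responses equal one shared Bernoulli(p)
    draw; otherwise they are independent Bernoulli(p). This also has
    marginal probability p and intra-cluster correlation rho.
    """
    p_none = (1 - rho) * (1 - p) ** n + rho * (1 - p)
    return 1 - p_none


if __name__ == "__main__":
    p, rho = 0.1, 0.2  # hypothetical values, not from the article
    for n in (1, 2, 5, 10):
        print(n, risk_beta_binomial(p, rho, n), risk_shared_mixture(p, rho, n))
```

Because both models share the same first two moments, the two risks coincide for clusters of size 1 and 2 and separate only for larger clusters. With the assumed values and n = 10 they are about 0.43 and 0.54 respectively.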