Models for zero‐inflated, correlated count data with extra heterogeneity: when is it too complex?
Author(s) - Chebon Sammy, Faes Christel, Cools Frank, Geys Helena
Publication year - 2016
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.7142
Subject(s) - overdispersion, covariate, count data, poisson regression, statistics, poisson distribution, econometrics, quasi likelihood, zero inflated model, model selection, mathematics, computer science, medicine, population, environmental health
Statistical analysis of count data typically starts with a Poisson regression. However, in many real‐life applications, the variance of the counts is larger than the mean, and one needs to deal with overdispersion. Several factors may contribute to overdispersion: (1) unobserved heterogeneity due to missing covariates, (2) correlation between observations (such as in longitudinal studies), and (3) the occurrence of many zeros (more than expected from the Poisson distribution). In this paper, we discuss a model that allows one to explicitly take each of these factors into consideration. The aim of this paper is twofold: (1) to investigate whether we can identify the cause of overdispersion via model selection, and (2) to investigate the impact of model misspecification on the power to detect a covariate effect. The paper is motivated by a study of the occurrence of drug‐induced arrhythmia in beagle dogs based on electrocardiogram recordings, with the objective of evaluating the effect of potential drugs on heartbeat irregularities. Copyright © 2016 John Wiley & Sons, Ltd.
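The three sources of overdispersion listed in the abstract can be illustrated with a small simulation. The sketch below is not the model from the paper. It is a minimal illustration that assumes a log-linear mean with a normal random intercept per subject (correlation), a mean-one gamma multiplier per observation (unobserved heterogeneity), and a fixed probability of a structural zero (zero inflation). The function name `simulate_counts` and all parameter values are hypothetical, chosen only for demonstration.

```python
import numpy as np


def simulate_counts(n_subjects=200, n_visits=10, beta0=1.0,
                    sigma_b=0.5, gamma_shape=2.0, pi_zero=0.3, seed=0):
    """Simulate repeated counts with three sources of overdispersion.

    All parameter names and default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)

    # (2) Subject-level random intercept: shared across a subject's visits,
    #     which induces correlation between repeated observations.
    b = rng.normal(0.0, sigma_b, size=(n_subjects, 1))

    # (1) Observation-level gamma multiplier with mean 1: stands in for
    #     unobserved heterogeneity due to missing covariates.
    theta = rng.gamma(shape=gamma_shape, scale=1.0 / gamma_shape,
                      size=(n_subjects, n_visits))

    # Conditional Poisson mean and the resulting counts.
    mu = np.exp(beta0 + b) * theta
    y = rng.poisson(mu)

    # (3) Structural zeros: with probability pi_zero the count is forced to 0.
    structural_zero = rng.random((n_subjects, n_visits)) < pi_zero
    y[structural_zero] = 0
    return y


y = simulate_counts()
mean = y.mean()
var = y.var(ddof=1)
dispersion = var / mean          # equals 1 in expectation for pure Poisson data
zero_frac = (y == 0).mean()      # observed share of zeros
poisson_zero = np.exp(-mean)     # share of zeros a Poisson with this mean implies

print(f"mean={mean:.2f}, variance={var:.2f}, variance/mean={dispersion:.2f}")
print(f"observed zeros={zero_frac:.2f}, Poisson-implied zeros={poisson_zero:.2f}")
```

Setting `sigma_b=0`, `pi_zero=0`, or a very large `gamma_shape` switches off one source at a time, which gives a rough feel for why the marginal variance-to-mean ratio alone may not reveal which factor is responsible.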