New variable selection methods for zero‐inflated count data with applications to the substance abuse field
Author(s) - Buu Anne, Johnson Norman J., Li Runze, Tan Xianming
Publication year - 2011
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.4268
Subject(s) - count data , zero inflated model , statistics , poisson distribution , computer science , poisson regression , logistic regression , field (mathematics) , zero (linguistics) , econometrics , inflation (cosmology) , ignorance , substance abuse , overdispersion , feature selection , variable (mathematics) , mathematics , medicine , artificial intelligence , psychiatry , environmental health , population , linguistics , philosophy , physics , epistemology , theoretical physics , pure mathematics , mathematical analysis
Zero‐inflated count data are very common in health surveys. This study develops new variable selection methods for the zero‐inflated Poisson regression model. Our simulations demonstrate the negative consequences of ignoring zero‐inflation. Among the competing methods, the one‐step SCAD method is recommended because it has the highest specificity, sensitivity, and exact fit, and the lowest estimation error. The simulation design is based on the special features of two large national databases commonly used in the alcoholism and substance abuse field, so that our findings generalize readily to real settings. Applications of the methodology are demonstrated through empirical analyses of data from a well‐known alcohol study. Copyright © 2011 John Wiley & Sons, Ltd.
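
For readers unfamiliar with the two ingredients named in the abstract, the following is a minimal sketch of the standard zero‐inflated Poisson probability mass function and the standard SCAD penalty with its derivative. It is not the authors' implementation: the function names, the parameter names (`lam`, `pi`, `tuning`), and the default `a = 3.7` are assumptions chosen for illustration. One‐step estimators are commonly built by using the SCAD derivative, evaluated at an initial estimate, as per‐coefficient weights in a weighted L1 problem; whether the paper proceeds exactly this way is not stated in the abstract.

```python
import math


def zip_pmf(y, lam, pi):
    """P(Y = y) for a zero-inflated Poisson (illustrative parameterization).

    With probability pi the outcome is a structural zero; otherwise it is
    drawn from Poisson(lam).
    """
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def scad_penalty(theta, tuning, a=3.7):
    """Standard SCAD penalty for one coefficient (a = 3.7 is a common default)."""
    t = abs(theta)
    if t <= tuning:
        return tuning * t
    if t <= a * tuning:
        return (2 * a * tuning * t - t ** 2 - tuning ** 2) / (2 * (a - 1))
    return tuning ** 2 * (a + 1) / 2


def scad_derivative(theta, tuning, a=3.7):
    """Derivative of the SCAD penalty with respect to |theta|.

    Small coefficients get the full L1 weight; large ones get weight zero,
    which is why large effects are left essentially unshrunk.
    """
    t = abs(theta)
    if t <= tuning:
        return tuning
    return max(a * tuning - t, 0.0) / (a - 1)
```

Ignoring zero‐inflation in this sketch amounts to fixing `pi = 0`, which forces a plain Poisson to account for every zero in the data.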
