Variable selection for zero‐inflated and overdispersed data with application to health care demand in Germany
Author(s) - Wang Zhu, Ma Shuangge, Wang ChingYun
Publication year - 2015
Publication title - Biometrical Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.108
H-Index - 63
eISSN - 1521-4036
pISSN - 0323-3847
DOI - 10.1002/bimj.201400143
Subject(s) - lasso (programming language), feature selection, statistics, mathematics, minimax, binomial regression, selection (genetic algorithm), penalty method, model selection, expectation–maximization algorithm, negative binomial distribution, count data, scad, mathematical optimization, logistic regression, econometrics, computer science, maximum likelihood, medicine, artificial intelligence, psychiatry, world wide web, myocardial infarction, poisson distribution
In health services and outcome research, count outcomes are frequently encountered and often have a large proportion of zeros. The zero‐inflated negative binomial (ZINB) regression model has important applications for this type of data. With many possible candidate risk factors, this paper proposes new variable selection methods for the ZINB model. We consider penalized maximum likelihood, with penalties including the least absolute shrinkage and selection operator (LASSO), smoothly clipped absolute deviation (SCAD), and minimax concave penalty (MCP). An EM (expectation‐maximization) algorithm is proposed for estimating the model parameters and conducting variable selection simultaneously. This algorithm consists of estimating penalized weighted negative binomial models and penalized logistic models via the coordinate descent algorithm. Furthermore, statistical properties including the standard error formulae are provided. A simulation study shows that the new algorithm not only gives more accurate or at least comparable estimates, but is also more robust than traditional stepwise variable selection. The proposed methods are applied to analyze health care demand in Germany using the open‐source R package mpath.
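
The three penalties named in the abstract can be illustrated with a short sketch. This is not code from the paper or from the mpath package: the function names are hypothetical, the formulas are the standard textbook forms of LASSO, SCAD, and MCP for a single coefficient, and the defaults a = 3.7 and gamma = 3.0 are conventional choices assumed here, so the paper's own parametrization may differ.

```python
def lasso_penalty(beta: float, lam: float) -> float:
    """L1 penalty: grows linearly in |beta| without bound."""
    return lam * abs(beta)


def scad_penalty(beta: float, lam: float, a: float = 3.7) -> float:
    """SCAD penalty (requires a > 2); a = 3.7 is an assumed conventional default."""
    t = abs(beta)
    if t <= lam:
        # Same as LASSO near zero, so small coefficients are shrunk to zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that gradually relaxes the shrinkage.
        return (2.0 * a * lam * t - t * t - lam * lam) / (2.0 * (a - 1.0))
    # Constant beyond a * lam: large coefficients get no extra penalty.
    return lam * lam * (a + 1.0) / 2.0


def mcp_penalty(beta: float, lam: float, gamma: float = 3.0) -> float:
    """MCP penalty (requires gamma > 1); gamma = 3.0 is an assumed default."""
    t = abs(beta)
    if t <= gamma * lam:
        # Penalization rate decreases linearly from lam down to zero.
        return lam * t - t * t / (2.0 * gamma)
    # Constant beyond gamma * lam.
    return gamma * lam * lam / 2.0


if __name__ == "__main__":
    for b in (0.5, 2.0, 10.0):
        print(b, lasso_penalty(b, 1.0), scad_penalty(b, 1.0), mcp_penalty(b, 1.0))
```

All three agree (or nearly agree) for small coefficients, but SCAD and MCP level off to a constant for large ones, which is why they bias large effects less than LASSO does.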