Prior elicitation, variable selection and Bayesian computation for logistic regression models
Author(s) - Chen M.H., Ibrahim J.G., Yiannoutsos C.
Publication year - 1999
Publication title - Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 6.523
H-Index - 137
eISSN - 1467-9868
pISSN - 1369-7412
DOI - 10.1111/1467-9868.00173
Subject(s) - computer science, prior probability, logistic regression, computation, approximate bayesian computation, bayesian linear regression, model selection, feature selection, bayesian probability, variable (mathematics), posterior probability, gibbs sampling, selection (genetic algorithm), machine learning, artificial intelligence, data mining, bayesian inference, algorithm, mathematics, inference, mathematical analysis
Bayesian selection of variables is often difficult to carry out because of three challenges: specifying prior distributions for the regression parameters of all possible models, specifying a prior distribution on the model space, and carrying out the computations. We address these three issues for the logistic regression model. For the first, we propose an informative prior distribution for variable selection. Several theoretical and computational properties of the prior are derived and illustrated with several examples. For the second, we propose a method for specifying an informative prior on the model space, and for the third we propose novel methods for computing the marginal distribution of the data. The new computational algorithms require only Gibbs samples from the full model to compute the prior and posterior model probabilities for all possible models. Several properties of the algorithms are also derived. The prior specification for the first challenge focuses on the observables, in that the elicitation is based on a prior prediction y_0 for the response vector and a quantity a_0 quantifying the uncertainty in y_0. Then y_0 and a_0 are used to specify a prior for the regression coefficients semi-automatically. Examples using real data are given to demonstrate the methodology.
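The abstract does not give the functional form of the prior, so the following is only a hedged sketch of how a prior prediction and an uncertainty weight could jointly define a prior on logistic regression coefficients. It assumes the prior is proportional to the logistic likelihood evaluated at the prior prediction, raised to the power of the weight, multiplied by a vague normal initial prior. The function name `log_prior_unnormalized`, the variance parameter `c0`, and the independent N(0, c0) initial prior are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np

def log_prior_unnormalized(beta, X, y0, a0, c0=10.0):
    """Unnormalized log prior density for logistic regression coefficients.

    beta : coefficient vector, shape (p,)
    X    : design matrix, shape (n, p)
    y0   : prior prediction for the response, entries in [0, 1], shape (n,)
    a0   : non-negative weight; larger values express more confidence in y0
    c0   : variance of the assumed N(0, c0) initial prior (hypothetical choice)
    """
    eta = X @ beta
    # Logistic log-likelihood kernel evaluated at the prior prediction y0.
    # np.logaddexp(0, eta) computes log(1 + exp(eta)) stably.
    loglik_at_y0 = np.sum(y0 * eta - np.logaddexp(0.0, eta))
    # Assumed vague initial prior: independent N(0, c0) on each coefficient.
    log_initial = -0.5 * np.sum(beta ** 2) / c0
    return a0 * loglik_at_y0 + log_initial
```

Under these assumptions, setting `a0 = 0` discards the prior prediction and leaves only the vague initial prior, while increasing `a0` concentrates the prior around coefficients that reproduce `y0`.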