Semiparametric mixed‐scale models using shared Bayesian forests
Author(s) - Antonio R. Linero, Debajyoti Sinha, Stuart R. Lipsitz
Publication year - 2020
Publication title - Biometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.298
H-Index - 130
eISSN - 1541-0420
pISSN - 0006-341X
DOI - 10.1111/biom.13107
Subject(s) - heteroscedasticity, homoscedasticity, covariate, semiparametric regression, computer science, bayesian probability, econometrics, nonparametric regression, nonparametric statistics, model selection, regression analysis, bayesian linear regression, multivariate statistics, bayesian information criterion, bayesian inference, statistics, machine learning, mathematics, artificial intelligence
This paper demonstrates the advantages of sharing information about unknown features of covariates across multiple model components in various nonparametric regression problems, including those with multivariate, heteroscedastic, and semicontinuous responses. We present a methodology that allows information to be shared nonparametrically across model components using Bayesian sum-of-trees models. Our simulation results demonstrate that sharing information across related model components is often very beneficial, particularly in sparse high-dimensional problems in which variable selection must be conducted. We illustrate our methodology by analyzing medical expenditure data from the Medical Expenditure Panel Survey (MEPS). To facilitate the Bayesian nonparametric regression analysis, we develop two novel models for the MEPS data using Bayesian additive regression trees: a heteroskedastic log-normal hurdle model with a “shrink-toward-homoskedasticity” prior, and a gamma hurdle model.
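To make the idea of a semicontinuous response concrete, the following is a minimal simulation sketch of a generic heteroscedastic log-normal hurdle model, not the paper's actual specification or fitting procedure. The function name `simulate_lognormal_hurdle`, the coefficient values, and the choice to let all three components depend on the first covariate are illustrative assumptions; they only mimic the situation in which related model components share the same relevant covariates.

```python
import numpy as np


def simulate_lognormal_hurdle(x, rng):
    """Draw semicontinuous responses from a toy log-normal hurdle model.

    All component functions below are hypothetical and chosen for
    illustration only; each depends on the same covariate x[:, 0].
    """
    # Hurdle component: probability that the response is positive.
    p_positive = 1.0 / (1.0 + np.exp(-(0.5 + 1.5 * x[:, 0])))
    # Location of log(Y) given Y > 0.
    mu = 1.0 + 2.0 * x[:, 0]
    # Covariate-dependent scale of log(Y), i.e. heteroscedasticity.
    sigma = np.exp(0.3 * x[:, 0])

    is_positive = rng.random(x.shape[0]) < p_positive
    log_y = rng.normal(mu, sigma)
    # Exact zeros when the hurdle is not crossed, log-normal draws otherwise.
    y = np.where(is_positive, np.exp(log_y), 0.0)
    return y, p_positive


rng = np.random.default_rng(0)
x = rng.normal(size=(5000, 10))  # only the first of 10 covariates matters
y, p_positive = simulate_lognormal_hurdle(x, rng)

print("share of exact zeros:", np.mean(y == 0.0))
print("expected share of zeros:", np.mean(1.0 - p_positive))
```

In this sketch nine of the ten covariates are irrelevant to every component, which is the kind of sparse setting the abstract refers to when it discusses variable selection; a model that shares information across components could, in principle, learn once that only the first covariate matters.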