Motor insurance claim modelling with factor collapsing and Bayesian model averaging
Author(s) - Hu Sen, O'Hagan Adrian, Murphy Thomas Brendan
Publication year - 2018
Publication title - Stat
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.61
H-Index - 18
ISSN - 2049-1573
DOI - 10.1002/sta4.180
Subject(s) - interpretability, categorical variable, econometrics, cluster analysis, model selection, bayesian probability, feature selection, bayesian inference, selection (genetic algorithm), life insurance, computer science, factor analysis, bayes factor, linear model, feature (linguistics), mathematics, artificial intelligence, machine learning, economics, actuarial science, linguistics, philosophy
While generalized linear models have become the insurance industry's standard approach for claim modelling, the approach of utilizing a single best model on which predictions are based ignores model selection uncertainty. An additional feature of insurance claim data sets is the common presence of categorical variables, within which the number of levels is high, and not all levels may be statistically significant. In such cases, some subsets of the levels may be merged to give a smaller overall number of levels for improved model parsimony and interpretability. Hence, clustering of the levels poses an additional model uncertainty issue. A method is proposed for assessing the optimal manner of collapsing factors with many levels into factors with smaller numbers of levels, and Bayesian model averaging is used to blend model predictions from all reasonable models to account for selection uncertainty. This method will be computationally intensive when the number of factors being collapsed or the number of levels within factors increases. Hence, a stochastic approach is used to quickly identify the best collapsing cases across the model space. Copyright © 2018 John Wiley & Sons, Ltd.
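The two ideas in the abstract, blending predictions across candidate models and the rapid growth of the collapsing search space, can be illustrated with a minimal Python sketch. The abstract does not say how the model weights are computed, so the sketch assumes the common BIC approximation to posterior model probabilities, with each weight proportional to exp(-BIC/2). It also assumes that any grouping of a factor's levels is an admissible collapsing, in which case the number of candidates for a factor with n levels is the Bell number B(n). The function names and all numeric inputs are hypothetical and are not taken from the paper.

```python
import math


def bma_weights(bic_values):
    """Approximate posterior model probabilities from BIC values.

    Assumption: equal prior model probabilities and the BIC
    approximation, so each weight is proportional to exp(-BIC / 2).
    """
    best = min(bic_values)
    # Subtract the smallest BIC before exponentiating for numerical stability.
    raw = [math.exp(-0.5 * (bic - best)) for bic in bic_values]
    total = sum(raw)
    return [r / total for r in raw]


def bma_prediction(predictions, weights):
    """Weighted average of the per-model predictions for one policy."""
    return sum(p * w for p, w in zip(predictions, weights))


def bell_number(n):
    """Number of ways to partition n factor levels into non-empty groups."""
    # Build successive rows of the Bell triangle; the last entry
    # of the n-th row is the n-th Bell number.
    row = [1]
    for _ in range(n - 1):
        new_row = [row[-1]]
        for x in row:
            new_row.append(new_row[-1] + x)
        row = new_row
    return row[-1]


if __name__ == "__main__":
    # Hypothetical BIC values for three candidate collapsings of one factor.
    bics = [1502.3, 1500.1, 1507.8]
    weights = bma_weights(bics)
    # Hypothetical predicted claim amounts from the same three models.
    preds = [410.0, 395.0, 430.0]
    print("weights:", [round(w, 3) for w in weights])
    print("averaged prediction:", round(bma_prediction(preds, weights), 2))
    # Growth of the collapsing search space with the number of levels.
    for levels in (4, 10, 20):
        print(levels, "levels ->", bell_number(levels), "possible collapsings")
```

Under these assumptions a factor with only 10 levels already admits 115975 possible collapsings, which gives a sense of why an exhaustive comparison becomes impractical and a stochastic search over the model space is attractive.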