Regression Trees Identify Relevant Interactions: Can This Improve the Predictive Performance of Risk Adjustment?
Author(s) - Buchner Florian, Wasem Jürgen, Schillo Sonja
Publication year - 2017
Publication title - Health Economics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.55
H-Index - 109
eISSN - 1099-1050
pISSN - 1057-9230
DOI - 10.1002/hec.3277
Subject(s) - regression, set (abstract data type), decision tree, regression analysis, statistics, data set, tree (set theory), econometrics, computer science, sample (material), mathematics, data mining, mathematical analysis, chemistry, chromatography, programming language
Risk equalization formulas have been refined since their introduction about two decades ago. Because the possible interactions between the variables used are complex and numerous, hardly any interactions are considered. A regression tree is used to search systematically for interactions, a methodologically new approach in risk equalization. Analyses are based on a data set of nearly 2.9 million individuals from a major German social health insurer. A two‐step approach is applied. In the first step, a regression tree is built on the learning data set. Terminal nodes characterized by more than one morbidity‐group split represent interaction effects of different morbidity groups. In the second step, the ‘traditional’ weighted least squares regression equation is expanded by adding interaction terms for all interactions detected by the tree, and the regression coefficients are recalculated. The resulting risk adjustment formula improves the adjusted R² from 25.43% to 25.81% on the evaluation data set. Predictive ratios are calculated for subgroups affected by the interactions. The detected R² improvement of 0.38 percentage points is only marginal. According to the sample‐level performance measures used, omitting a considerable number of morbidity interactions causes no relevant loss in accuracy. Copyright © 2015 John Wiley & Sons, Ltd.
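The two‐step approach described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy data, the three binary morbidity flags, the unit weights, and the helper names `grow`, `design`, `wls` and `r_squared` are all assumptions made for illustration, and the sketch reports plain in‐sample R² rather than adjusted R² on a separate evaluation set.

```python
def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((v - m) ** 2 for v in ys)

def grow(X, y, rows, depth, min_leaf, path=(), min_gain=1e-9):
    """Step 1: grow a regression tree on 0/1 flags; return one path per leaf.

    A path lists the flags that equal 1 on the way from the root to the leaf.
    """
    parent = sse([y[i] for i in rows])
    best = None
    if depth > 0:
        for j in range(len(X[0])):
            left = [i for i in rows if X[i][j] == 0]
            right = [i for i in rows if X[i][j] == 1]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            score = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or score < best[0]:
                best = (score, j, left, right)
    if best is None or parent - best[0] <= min_gain:
        return [path]  # terminal node
    _, j, left, right = best
    return (grow(X, y, left, depth - 1, min_leaf, path, min_gain)
            + grow(X, y, right, depth - 1, min_leaf, path + (j,), min_gain))

def design(X, interactions):
    """Intercept, main effects, and one product column per interaction."""
    out = []
    for x in X:
        row = [1.0] + [float(v) for v in x]
        for combo in interactions:
            prod = 1.0
            for j in combo:
                prod *= x[j]
            row.append(prod)
        out.append(row)
    return out

def wls(D, y, w):
    """Step 2: weighted least squares via the normal equations (D'WD) b = D'Wy."""
    k = len(D[0])
    A = [[sum(wi * d[p] * d[q] for d, wi in zip(D, w)) for q in range(k)]
         + [sum(wi * d[p] * yi for d, yi, wi in zip(D, y, w))]
         for p in range(k)]
    for col in range(k):  # Gauss-Jordan elimination with partial pivoting
        piv = max(range(col, k), key=lambda i: abs(A[i][col]))
        A[col], A[piv] = A[piv], A[col]
        for i in range(k):
            if i != col:
                f = A[i][col] / A[col][col]
                A[i] = [u - f * v for u, v in zip(A[i], A[col])]
    return [A[p][k] / A[p][p] for p in range(k)]

def r_squared(D, y, w, beta):
    """Weighted in-sample R² (not the adjusted R² reported in the abstract)."""
    pred = [sum(b * v for b, v in zip(beta, d)) for d in D]
    mean = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    ss_res = sum(wi * (yi - pi) ** 2 for wi, yi, pi in zip(w, y, pred))
    ss_tot = sum(wi * (yi - mean) ** 2 for wi, yi in zip(w, y))
    return 1 - ss_res / ss_tot

# Toy data (assumed): flags a and b interact, flag c has no effect on cost.
X = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)] * 5
y = [100 + 500 * a + 300 * b + 800 * a * b for a, b, c in X]
w = [1.0] * len(X)  # unit weights; real weights are an assumption left open

# Step 1: leaves reached through more than one flag split signal interactions.
paths = grow(X, y, list(range(len(X))), depth=3, min_leaf=2)
interactions = sorted({tuple(sorted(p)) for p in paths if len(p) >= 2})

# Step 2: refit the regression with and without the detected interaction terms.
D_base, D_full = design(X, []), design(X, interactions)
beta_base, beta_full = wls(D_base, y, w), wls(D_full, y, w)
r2_base = r_squared(D_base, y, w, beta_base)
r2_full = r_squared(D_full, y, w, beta_full)
print(interactions, round(r2_base, 4), round(r2_full, 4))
```

On this noise‐free toy data the tree isolates the pair of flags `(0, 1)`, and adding its product term raises R² markedly. That contrasts with the marginal gain reported above for real claims data, where main effects already capture most of the explainable variance.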