Adjusting for population differences using machine learning methods
Author(s) - Cappiello Lauren, Zhang Zhiwei, Shen Changyu, Butala Neel M., Cui Xinping, Yeh Robert W.
Publication year - 2021
Publication title - Journal of the Royal Statistical Society: Series C (Applied Statistics)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.205
H-Index - 72
eISSN - 1467-9876
pISSN - 0035-9254
DOI - 10.1111/rssc.12486
Subject(s) - estimator, population, nonparametric statistics, context (archaeology), machine learning, computer science, parametric statistics, artificial intelligence, statistics, econometrics, regression, mathematics, mathematical optimization, medicine, environmental health, paleontology, biology
Abstract - The use of real-world data for medical treatment evaluation frequently requires adjusting for population differences. We consider this problem in the context of estimating mean outcomes and treatment differences in a well-defined target population, using clinical data from a study population that overlaps with but differs from the target population in terms of patient characteristics. The current literature on this subject includes a variety of statistical methods, which generally require correct specification of at least one parametric regression model. In this article, we propose to use machine learning methods to estimate nuisance functions and incorporate the machine learning estimates into existing doubly robust estimators. This leads to nonparametric estimators that are √n-consistent, asymptotically normal and asymptotically efficient under general conditions. Simulation results demonstrate that the proposed methods perform reasonably well in realistic settings. The methods are illustrated with a cardiology example concerning aortic stenosis.
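To make the idea of plugging nuisance estimates into a doubly robust estimator more concrete, the following is a minimal sketch of one generic augmented estimator for a mean outcome in a target population. It is not taken from the article and may differ from the estimators the authors study. It assumes a simplified setting in which outcomes are observed only in the study sample (s = 1) and covariates are observed in both samples. The function name `dr_target_mean` and the inputs `m_hat` (estimated outcome regression E[Y | X, S = 1]) and `p_hat` (estimated probability of study membership P(S = 1 | X)) are hypothetical; in practice both would be fitted values from any machine learning method.

```python
from typing import Sequence


def dr_target_mean(
    s: Sequence[int],
    y: Sequence[float],
    m_hat: Sequence[float],
    p_hat: Sequence[float],
) -> float:
    """Generic doubly robust estimate of the target-population mean outcome.

    Assumed (hypothetical) inputs, one entry per unit:
      s[i]     : 1 if unit i is in the study sample, 0 if in the target sample
      y[i]     : outcome; only read when s[i] == 1
      m_hat[i] : estimated E[Y | X_i, S = 1] from any learner
      p_hat[i] : estimated P(S = 1 | X_i) from any learner, strictly in (0, 1)
    """
    n_target = sum(1 for si in s if si == 0)
    if n_target == 0:
        raise ValueError("need at least one target-sample unit (s == 0)")

    total = 0.0
    for si, yi, mi, pi in zip(s, y, m_hat, p_hat):
        if si == 0:
            # Target unit: contribute the predicted outcome.
            total += mi
        else:
            # Study unit: contribute the residual, weighted by the
            # estimated odds of target versus study membership.
            total += (1.0 - pi) / pi * (yi - mi)
    return total / n_target


# Toy usage with made-up numbers (not data from the article).
s = [1, 1, 0, 0]
y = [2.0, 4.0, 0.0, 0.0]  # outcomes for target units are placeholders

# If the outcome model fits the study units exactly, residuals vanish and
# the estimate is the average prediction over the target sample.
est_outcome_model = dr_target_mean(s, y, [2.0, 4.0, 3.0, 5.0], [0.5] * 4)

# If the outcome model is identically zero, the estimate reduces to an
# odds-weighted sum of study outcomes divided by the target sample size.
est_weighting = dr_target_mean(s, y, [0.0] * 4, [0.5] * 4)
```

The two toy calls illustrate why such an estimator is called doubly robust: it can lean on either the outcome model or the membership model, so a good estimate of one can compensate for a poor estimate of the other.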