
Addressing missing data in randomized clinical trials: A causal inference perspective
Author(s) -
Ilja Cornelisz,
Pim Cuijpers,
Tara Donker,
Chris van Klaveren
Publication year - 2020
Publication title -
plos one
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.99
H-Index - 332
ISSN - 1932-6203
DOI - 10.1371/journal.pone.0234349
Subject(s) - causal inference , estimator , inference , missing data , attrition , econometrics , selection bias , bounding overwatch , randomized experiment , point estimation , statistics , outcome (game theory) , randomized controlled trial , average treatment effect , computer science , perspective (graphical) , mathematics , medicine , artificial intelligence , surgery , dentistry , mathematical economics
Background The importance of randomization in clinical trials has long been acknowledged for avoiding selection bias. Yet, bias concerns re-emerge with selective attrition. This study takes a causal inference perspective in addressing distinct scenarios of missing outcome data (MCAR, MAR and MNAR). Methods This study adopts a causal inference perspective in providing an overview of empirical strategies to estimate the average treatment effect, improve precision of the estimator, and to test whether the underlying identifying assumptions hold. We propose to use Random Forest Lee Bounds (RFLB) to address selective attrition and to obtain more precise average treatment effect intervals. Results When assuming MCAR or MAR, the often untenable identifying assumptions with respect to causal inference can hardly be verified empirically. Instead, missing outcome data in clinical trials should be considered as potentially non-random unobserved events (i.e. MNAR). Using simulated attrition data, we show how average treatment effect intervals can be tightened considerably using RFLB, by exploiting both continuous and discrete attrition predictor variables. Conclusions Bounding approaches should be used to acknowledge selective attrition in randomized clinical trials in acknowledging the resulting uncertainty with respect to causal inference. As such, Random Forest Lee Bounds estimates are more informative than point estimates obtained assuming MCAR or MAR.