Rerandomization and regression adjustment
Author(s) - Li Xinran, Ding Peng
Publication year - 2020
Publication title - Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 6.523
H-Index - 137
eISSN - 1467-9868
pISSN - 1369-7412
DOI - 10.1111/rssb.12353
Subject(s) - covariate, estimator, statistics, regression analysis, regression, analysis of covariance, inference, mathematics, covariance, linear regression, statistical inference, econometrics, computer science, artificial intelligence
Summary - Randomization is a basis for the statistical inference of treatment effects without strong assumptions on the outcome‐generating process. Appropriately using covariates further yields more precise estimators in randomized experiments. R. A. Fisher suggested blocking on discrete covariates in the design stage or conducting analysis of covariance in the analysis stage. We can embed blocking in a wider class of experimental design called rerandomization, and extend the classical analysis of covariance to more general regression adjustment. Rerandomization trumps complete randomization in the design stage, and regression adjustment trumps the simple difference‐in‐means estimator in the analysis stage. It is then intuitive to use both rerandomization and regression adjustment. Under the randomization inference framework, we establish a unified theory allowing the designer and analyser to have access to different sets of covariates. We find that asymptotically, for any given estimator with or without regression adjustment, rerandomization never hurts either the sampling precision or the estimated precision, and, for any given design with or without rerandomization, our regression‐adjusted estimator never hurts the estimated precision. Therefore, combining rerandomization and regression adjustment yields better coverage properties and thus improves statistical inference. To quantify these statements theoretically, we discuss optimal regression‐adjusted estimators in terms of the sampling precision and the estimated precision, and then measure the additional gains of the designer and the analyser. We finally suggest the use of rerandomization in the design and regression adjustment in the analysis followed by the Huber–White robust standard error.
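The workflow recommended at the end of the summary (rerandomize in the design, regression-adjust in the analysis, report a Huber–White robust standard error) can be illustrated with a minimal Python sketch. It is not the authors' code, and several details are assumptions made here because the summary does not specify them: the Mahalanobis-distance acceptance rule and its threshold, the fully interacted regression on centered covariates, the HC0 variant of the robust variance, and all function and variable names (`mahalanobis_imbalance`, `rerandomize`, `adjusted_estimate`). The designer's covariates `X` and the analyser's covariates `W` are kept as separate arguments, since the summary allows them to differ.

```python
import numpy as np


def mahalanobis_imbalance(X, z):
    """Mahalanobis distance between treated and control covariate means.

    Assumed balance criterion; the summary does not name one.
    """
    n = len(z)
    n1 = int(z.sum())
    n0 = n - n1
    diff = X[z == 1].mean(axis=0) - X[z == 0].mean(axis=0)
    # Covariance of the difference in means under complete randomization.
    cov = np.atleast_2d(np.cov(X, rowvar=False)) * n / (n1 * n0)
    return float(diff @ np.linalg.solve(cov, diff))


def rerandomize(X, n_treated, threshold, rng, max_draws=10_000):
    """Redraw complete randomizations until the imbalance is acceptable."""
    n = X.shape[0]
    for _ in range(max_draws):
        z = np.zeros(n, dtype=int)
        z[rng.choice(n, size=n_treated, replace=False)] = 1
        if mahalanobis_imbalance(X, z) <= threshold:
            return z
    raise RuntimeError("no acceptable assignment found within max_draws")


def adjusted_estimate(y, z, W):
    """Coefficient on z from OLS of y on (1, z, W_c, z * W_c), with HC0 SE.

    W_c is W centered at its column means. The interacted specification
    is an assumption of this sketch.
    """
    Wc = W - W.mean(axis=0)
    D = np.column_stack([np.ones(len(y)), z, Wc, z[:, None] * Wc])
    bread = np.linalg.inv(D.T @ D)
    beta = bread @ D.T @ y
    resid = y - D @ beta
    # Sandwich "meat": D' diag(resid^2) D.
    meat = (D * resid[:, None] ** 2).T @ D
    V = bread @ meat @ bread
    return float(beta[1]), float(np.sqrt(V[1, 1]))
```

A hypothetical call sequence would be `z = rerandomize(X, n_treated=50, threshold=1.0, rng=np.random.default_rng(0))` at the design stage, then `tau_hat, se = adjusted_estimate(y, z, W)` once outcomes are observed. A smaller threshold enforces tighter covariate balance at the cost of more redraws.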