Ridge rerandomization: an experimental design strategy in the presence of covariate collinearity

Abstract

Randomization ensures that observed and unobserved covariates are balanced, on average. However, randomizing units to treatment and control often leads to covariate imbalances in realization, and such imbalances can inflate the variance of estimators of the treatment effect. One solution to this problem is rerandomization, an experimental design strategy that randomizes units until some balance criterion is fulfilled, which yields more precise estimators of the treatment effect if covariates are correlated with the outcome. Most rerandomization schemes in the literature use the Mahalanobis distance, which may not be preferable when covariates are correlated or vary in importance. As an alternative, we introduce an experimental design strategy called ridge rerandomization, which uses a modified Mahalanobis distance that addresses collinearities among covariates and automatically places a hierarchy of importance on the covariates according to their eigenstructure. This modified Mahalanobis distance has connections to principal components and the Euclidean distance and, to our knowledge, has remained unexplored. We establish several theoretical properties of this modified Mahalanobis distance and our ridge rerandomization scheme. These results guarantee that ridge rerandomization is preferable to randomization and suggest when ridge rerandomization is preferable to standard rerandomization schemes. We also provide simulation evidence suggesting that ridge rerandomization is particularly preferable to typical rerandomization schemes in high-dimensional or high-collinearity settings.
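
The procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not state the formula for the modified distance, so the code below assumes one plausible form: a ridge term lam * I is added to the covariance matrix of the covariate mean difference before inversion. The function names, the parameters lam, threshold and max_draws, and the covariance scaling under complete randomization are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def ridge_mahalanobis(X, w, lam):
    """Ridge-modified Mahalanobis distance between group covariate means.

    X   : (n, k) covariate matrix
    w   : length-n 0/1 assignment vector (1 = treatment)
    lam : non-negative ridge parameter (lam = 0 gives the usual distance)
    """
    n = X.shape[0]
    n_t = int(w.sum())
    n_c = n - n_t
    diff = X[w == 1].mean(axis=0) - X[w == 0].mean(axis=0)
    # Assumed scaling: covariance of the mean difference under complete
    # randomization is the sample covariance times n / (n_t * n_c).
    sigma = np.atleast_2d(np.cov(X, rowvar=False)) * n / (n_t * n_c)
    k = sigma.shape[0]
    # Solve (sigma + lam * I) z = diff instead of forming an explicit inverse.
    return float(diff @ np.linalg.solve(sigma + lam * np.eye(k), diff))


def ridge_rerandomize(X, n_treated, lam, threshold, rng, max_draws=10_000):
    """Draw complete randomizations until the ridge distance is <= threshold."""
    n = X.shape[0]
    for _ in range(max_draws):
        w = np.zeros(n, dtype=int)
        w[rng.choice(n, size=n_treated, replace=False)] = 1
        if ridge_mahalanobis(X, w, lam) <= threshold:
            return w
    raise RuntimeError("no acceptable assignment found within max_draws")


# Hypothetical usage with simulated covariates.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
w = ridge_rerandomize(X, n_treated=20, lam=0.1, threshold=2.0, rng=rng)
```

Setting lam to zero recovers the standard Mahalanobis criterion in this sketch. A positive lam keeps the matrix invertible when covariates are nearly collinear and shrinks the contribution of low-variance eigen-directions, which is one way to read the "hierarchy of importance" mentioned in the abstract.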

Publication
Journal of Statistical Planning and Inference, 211, 287-314