Econometrics of Randomized Experiments
Paper Session
Sunday, Jan. 8, 2017 3:15 PM – 5:15 PM
Hyatt Regency Chicago, Water Tower
- Chair: Alexander Torgovitsky, Northwestern University
Inference With Covariate-Adaptive Randomization
Abstract
This paper studies inference for the average treatment effect in randomized controlled trials with covariate-adaptive randomization. Here, by covariate-adaptive randomization, we mean randomization schemes that first stratify according to baseline covariates and then assign treatment status so as to achieve "balance" within each stratum. Such schemes include, for example, Efron's biased-coin design and stratified block randomization. When testing the null hypothesis that the average treatment effect equals a pre-specified value in such settings, we first show that the usual two-sample $t$-test is conservative in the sense that it has limiting rejection probability under the null hypothesis no greater than and typically strictly less than the nominal level. In a simulation study, we find that the rejection probability may in fact be dramatically less than the nominal level. We show further that these same conclusions remain true for a naive permutation test, but that a modified version of the permutation test yields a test that is non-conservative in the sense that its limiting rejection probability under the null hypothesis equals the nominal level for a wide variety of randomization schemes. The modified version of the permutation test has the additional advantage that it has rejection probability exactly equal to the nominal level for some distributions satisfying the null hypothesis and some randomization schemes. Finally, we show that the usual $t$-test (on the coefficient on treatment assignment) in a linear regression of outcomes on treatment assignment and indicators for each of the strata yields a non-conservative test as well under even weaker assumptions on the randomization scheme. In a simulation study, we find that the non-conservative tests have substantially greater power than the usual two-sample $t$-test.
Optimal Data Collection for Randomized Control Trials
Abstract
In a randomized control trial, the precision of an average treatment effect estimator can be improved either by collecting data on additional individuals, or by collecting additional covariates that predict the outcome variable. We propose the use of pre-experimental data such as a census, or a household survey, to inform the choice of both the sample size and the covariates to be collected. Our procedure seeks to minimize the resulting average treatment effect estimator's mean squared error, subject to the researcher's budget constraint. We rely on a modification of an orthogonal greedy algorithm that is conceptually simple and easy to implement in the presence of a large number of potential covariates, and does not require any tuning parameters. In two empirical applications, we show that our procedure can lead to substantial gains of up to 58%, measured either in terms of reductions in data collection costs or in terms of improvements in the precision of the treatment effect estimator.
Combining RCTs and Selection Models for External Validity
Abstract
Randomized Control Trials (RCTs) and observational data are typically viewed as substitutes for the evaluation of treatment effects. The estimates from RCTs are commonly seen as the "gold standard" for evaluating treatment effects, to be relied upon exclusively when available, while evidence from observational data is commonly seen as a poor substitute for experimental evidence and is to be used, if at all, only when evidence from an RCT is unavailable. With few exceptions, the literature for combining evidence from RCTs and observational studies when both are available has done so not to better evaluate treatment effects, but rather as a way to evaluate the validity of the nonexperimental approaches applied to the observational data. In contrast, in this paper we develop a methodology to combine evidence from an RCT with results from observational studies to leverage strengths from both approaches. In particular, this study considers the nonparametric selection model/Local Instrumental Variables approach of Heckman and Vytlacil (2005) applied to observational data, combined with analysis from an RCT. We demonstrate that, by combining the two approaches on the two types of data, one can obtain a deeper understanding of the connection between selection and treatment effects than would be possible with either approach in isolation. In addition, combining the two approaches allows greater external validity than would be possible with the RCT alone and more robust analysis than would be possible with the selection model alone, and it solves the problem of identification-at-infinity within selection models.
JEL Classifications
- C0 - General
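Illustration for "Inference With Covariate-Adaptive Randomization": the conservativeness of the two-sample $t$-test described in that abstract can be sketched with a small simulation. This is only a sketch under stated assumptions, not the authors' simulation design. The function name `simulate_rejection_rates`, the sample sizes, the outcome model (a stratum effect plus standard normal noise, with a true treatment effect of zero), the use of homoskedastic regression standard errors, and the normal critical value are all choices made here for illustration. Treatment is assigned by stratified block randomization, with exactly half of each stratum treated.

```python
import numpy as np


def simulate_rejection_rates(n_strata=4, n_per_stratum=50, n_sims=2000, seed=0):
    """Null rejection rates of (a) the two-sample t-test and (b) the t-test on
    treatment in a regression with strata indicators, under stratified block
    randomization. All design choices are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    z_crit = 1.959964  # two-sided 5% critical value of the standard normal
    n = n_strata * n_per_stratum
    strata = np.repeat(np.arange(n_strata), n_per_stratum)
    # One indicator column per stratum (no separate intercept is needed).
    strata_dummies = (strata[:, None] == np.arange(n_strata)[None, :]).astype(float)
    members = [np.flatnonzero(strata == s) for s in range(n_strata)]

    reject_two_sample = 0
    reject_regression = 0
    for _ in range(n_sims):
        # Stratified block randomization: treat exactly half of each stratum.
        d = np.zeros(n)
        for idx in members:
            treated = rng.choice(idx, size=n_per_stratum // 2, replace=False)
            d[treated] = 1.0

        # Outcome depends strongly on the stratum; the treatment effect is zero.
        y = 2.0 * strata + rng.standard_normal(n)

        # (a) Usual two-sample t-test with unequal-variance standard error.
        y1, y0 = y[d == 1], y[d == 0]
        se = np.sqrt(y1.var(ddof=1) / y1.size + y0.var(ddof=1) / y0.size)
        t_two_sample = (y1.mean() - y0.mean()) / se
        reject_two_sample += abs(t_two_sample) > z_crit

        # (b) OLS of y on treatment and strata indicators, homoskedastic SEs.
        X = np.column_stack([d, strata_dummies])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / (n - X.shape[1])
        cov = sigma2 * np.linalg.inv(X.T @ X)
        t_regression = beta[0] / np.sqrt(cov[0, 0])
        reject_regression += abs(t_regression) > z_crit

    return reject_two_sample / n_sims, reject_regression / n_sims


if __name__ == "__main__":
    rate_t, rate_reg = simulate_rejection_rates()
    print(f"two-sample t-test rejection rate:        {rate_t:.3f}")
    print(f"strata-indicator regression rejection:   {rate_reg:.3f}")
```

In this design the two-sample standard error includes the between-stratum variation that the balanced assignment has already removed from the difference in means, so the first rejection rate should fall far below the nominal 5% level, while the regression-based rate should sit close to it.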