Cluster-Robust Econometric Methods

Paper Session

Friday, Jan. 5, 2024 10:15 AM - 12:15 PM (CST)

Marriott Rivercenter, Conference Room 20
Hosted By: International Association of Applied Econometrics
  • Chair: Bruce E. Hansen, University of Wisconsin

Dyad-Robust Inference for International Trade Data

Colin Cameron, University of California-Davis
Douglas Miller, Cornell University

Abstract

In this paper we consider inference with paired or dyadic data, such as cross-section and panel data on trade between pairs of countries. Regression models with such data can have a complicated pattern of error correlation. For example, errors for US-UK trade may be correlated with those for any other country pair that includes either the US or the UK. The standard cluster-robust variance estimator, or sandwich estimator, based on one-way clustering on dyads is inadequate. The two-way cluster-robust estimator with clustering on each member of the dyad is a substantial improvement, but still understates standard errors. Instead one should use dyadic-robust standard errors. Qualitatively similar issues arise in social network data analysis, but the consequences are especially severe in international trade studies since trade networks are typically very dense. Applications with the gravity model of trade rarely use dyadic-robust standard errors, though panel data applications do include rich sets of fixed effects that could potentially control for dyadic correlation. Using several leading applications, we find that even when country-pair and country-time fixed effects are included, the failure to additionally use dyadic-robust standard errors can lead to reported standard errors that are several times too small.
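The idea of a dyadic-robust variance can be illustrated with a minimal sketch. It is not taken from the paper: it assumes an OLS regression, applies no finite-sample correction, and uses hypothetical names (`dyadic_robust_cov`, `node_a`, `node_b`). The sandwich "meat" sums score cross-products over every pair of observations whose dyads share at least one member, so US-UK is allowed to correlate with US-FR and UK-DE but not with FR-DE.

```python
import numpy as np

def dyadic_robust_cov(X, resid, node_a, node_b):
    """Sketch of a dyadic-robust OLS covariance (no small-sample correction).

    X      : (n, k) regressor matrix, one row per dyad observation
    resid  : (n,) OLS residuals
    node_a : (n,) label of the first member of each dyad (hypothetical name)
    node_b : (n,) label of the second member of each dyad (hypothetical name)
    """
    X = np.asarray(X, dtype=float)
    resid = np.asarray(resid, dtype=float)
    a = np.asarray(node_a)
    b = np.asarray(node_b)

    # share[d, e] is True when observations d and e have a member in common.
    # This is an n-by-n matrix, so the sketch only suits moderate n.
    share = (
        (a[:, None] == a[None, :])
        | (a[:, None] == b[None, :])
        | (b[:, None] == a[None, :])
        | (b[:, None] == b[None, :])
    )

    scores = X * resid[:, None]                        # rows are x_d * u_d
    meat = scores.T @ share.astype(float) @ scores     # sum over linked pairs
    bread = np.linalg.inv(X.T @ X)
    return bread @ meat @ bread
```

When no two dyads share a member, `share` reduces to the identity matrix and the result coincides with the heteroskedasticity-robust (HC0) estimator. Like other multiway estimators, this form is not guaranteed to be positive semidefinite in finite samples.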

Inference in Cluster Randomized Experiments with Matched Pairs

Max Tabord-Meehan, University of Chicago
Yuehao Bai, University of Southern California
Jizhou Liu, University of Chicago
Azeem M. Shaikh, University of Chicago

Abstract

This paper considers the problem of inference in cluster randomized trials where treatment status is determined according to a "matched pairs" design. Here, by a cluster randomized experiment, we mean one in which treatment is assigned at the level of the cluster; by a "matched pairs" design we mean that a sample of clusters is paired according to baseline, cluster-level covariates and, within each pair, one cluster is selected at random for treatment. We study the large sample behavior of a weighted difference-in-means estimator and derive two distinct sets of results depending on whether the matching procedure does or does not match on cluster size. We then propose a variance estimator which is consistent in either case. Combining these results establishes the asymptotic exactness of tests based on these estimators. Next, we consider the properties of two common testing procedures based on t-tests constructed from linear regressions, and argue that both are generally conservative in our framework. Finally, we study the behavior of a randomization test which permutes the treatment status for clusters within pairs, and establish its finite sample and asymptotic validity for testing specific null hypotheses. A simulation study confirms the practical relevance of our theoretical results.
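A randomization test that permutes treatment within pairs can be sketched in a few lines. The version below is a simplification and not the paper's procedure: it assumes the test statistic is the unweighted, unstudentized mean of within-pair differences in cluster-level mean outcomes, and the function names are hypothetical. Swapping which cluster in a pair is treated flips the sign of that pair's difference, so the randomization distribution is obtained by enumerating all sign patterns.

```python
from itertools import product

def pair_differences(treated_means, control_means):
    """Treated-minus-control difference in cluster mean outcomes, per pair."""
    return [t - c for t, c in zip(treated_means, control_means)]

def paired_randomization_pvalue(diffs):
    """Exact two-sided p-value from all 2**n within-pair treatment swaps.

    Assumes a simple statistic: the absolute mean of the pair differences.
    Full enumeration is only practical for a small number of pairs.
    """
    n = len(diffs)
    observed = abs(sum(diffs)) / n
    at_least_as_extreme = 0
    total = 0
    for signs in product((1, -1), repeat=n):
        stat = abs(sum(s * d for s, d in zip(signs, diffs))) / n
        # Small tolerance guards against floating-point ties.
        if stat >= observed - 1e-12:
            at_least_as_extreme += 1
        total += 1
    return at_least_as_extreme / total
```

With many pairs, the full enumeration would normally be replaced by a random sample of sign patterns.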

Inference on Quantile Processes with a Finite Number of Clusters

Andreas Hagemann, University of Michigan

Abstract

I introduce a generic method for inference on entire quantile and regression quantile processes in the presence of a finite number of large and arbitrarily heterogeneous clusters. The method asymptotically controls size by generating statistics that exhibit enough distributional symmetry such that randomization tests can be applied. The randomization test does not require ex-ante matching of clusters, is free of user-chosen parameters, and performs well at conventional significance levels with as few as five clusters. The method tests standard (non-sharp) hypotheses and can even be asymptotically similar in empirically relevant situations. The main focus of the paper is inference on quantile treatment effects but the method applies more broadly. Numerical and empirical examples are provided.

Jackknife Standard Errors for Clustered Regression

Bruce E. Hansen
,
University of Wisconsin

Abstract

This paper presents a theoretical case for replacement of conventional heteroskedasticity-consistent and cluster-robust variance estimators with jackknife variance estimators, in the context of linear regression with heteroskedastic and/or cluster-dependent observations. We examine the bias of variance estimation and the coverage probabilities of confidence intervals. Concerning bias, we show that conventional variance estimators have full downward worst-case bias, while our jackknife variance estimator is never downward biased. Concerning confidence intervals, we show that intervals based on conventional standard errors have worst-case coverage equalling zero, while our jackknife-based confidence interval has coverage probability bounded by the Cauchy distribution. We also extend the Bell-McCaffrey (2002) Student t approximation to our jackknife t-ratio, resulting in confidence intervals with improved coverage probabilities. Our theory holds under minimal assumptions, allowing arbitrary cluster sizes, regressor leverage, within-cluster correlation, heteroskedasticity, regression with a single treated cluster, fixed effects, and delete-cluster invertibility failures. Our theoretical findings are consistent with the extensive simulation literature investigating heteroskedasticity-consistent and cluster-robust variance estimation.
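For readers unfamiliar with the delete-cluster jackknife, a minimal sketch follows. It uses one common textbook form, with the (G - 1)/G scale factor and centering at the full-sample estimate; the exact scaling and centering in the paper may differ, and the function names are hypothetical. Each cluster is dropped in turn, the regression is refit, and the spread of the leave-one-cluster-out coefficients gives the variance estimate.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients. lstsq returns the minimum-norm solution
    if the design is rank deficient, which is a choice made for this sketch."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def cluster_jackknife_cov(X, y, cluster):
    """Delete-one-cluster jackknife covariance of the OLS coefficients.

    Assumed form: (G - 1) / G * sum_g (b_(-g) - b)(b_(-g) - b)',
    where b is the full-sample estimate and b_(-g) omits cluster g.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    cluster = np.asarray(cluster)

    beta_full = ols(X, y)
    labels = np.unique(cluster)
    G = len(labels)

    deviations = np.empty((G, X.shape[1]))
    for g, label in enumerate(labels):
        keep = cluster != label              # drop every row of cluster g
        deviations[g] = ols(X[keep], y[keep]) - beta_full

    cov = (G - 1) / G * deviations.T @ deviations
    return beta_full, cov
```

Standard errors are the square roots of the diagonal of `cov`. Passing a distinct label for every observation gives the heteroskedasticity-only (delete-one-observation) case.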

JEL Classifications
  • C1 - Econometric and Statistical Methods and Methodology: General
  • C2 - Single Equation Models; Single Variables