
Applied Machine Learning

Paper Session

Friday, Jan. 4, 2019 2:30 PM - 4:30 PM

Atlanta Marriott Marquis, Marquis Ballroom A
Hosted By: American Economic Association
  • Chair: Alberto Abadie, Massachusetts Institute of Technology

Causal Methods for Panel Data

Susan Athey, Stanford University
Guido Imbens, Stanford University

Abstract

In estimation of treatment effects in panel data settings, researchers have often used fixed effect models where variation between units and over time is captured completely by additive components. Such specifications may be restrictive if in fact there is heterogeneity in the treatment effects between units and over time. In this paper we explore machine learning methods to assess such heterogeneity and to develop richer, data-driven specifications. A key challenge is that simply treating the unit and time indicators as features leads to models with many features where sparsity may not be plausible. Instead we develop methods that build flexible models for the heterogeneity between units and over time that can be viewed as generalizing fixed effect models.
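
As a point of reference for the additive baseline this abstract describes, the following is a minimal sketch of a two-way fixed effects estimate on a balanced panel, computed by double demeaning. It is not the authors' method: the function name `twfe_estimate`, the constant effect `tau`, and the simulated panel are all illustrative assumptions.

```python
import numpy as np

def twfe_estimate(Y, W):
    """Two-way fixed effects estimate of a constant treatment effect.

    Assumes a balanced panel: Y and W have shape (n_units, n_periods).
    Subtracting unit means and period means (and adding back the grand
    mean) removes additive unit and time effects.
    """
    def double_demean(A):
        return (A - A.mean(axis=1, keepdims=True)
                  - A.mean(axis=0, keepdims=True) + A.mean())

    Y_t = double_demean(np.asarray(Y, dtype=float))
    W_t = double_demean(np.asarray(W, dtype=float))
    # Least-squares slope of demeaned outcome on demeaned treatment.
    return float((W_t * Y_t).sum() / (W_t ** 2).sum())

# Illustrative simulated panel (hypothetical numbers, no noise term).
rng = np.random.default_rng(0)
n_units, n_periods, tau = 50, 10, 2.0
unit_fe = rng.normal(size=(n_units, 1))    # additive unit effects
time_fe = rng.normal(size=(1, n_periods))  # additive time effects
W = (rng.random((n_units, n_periods)) < 0.5).astype(float)
Y = unit_fe + time_fe + tau * W

tau_hat = twfe_estimate(Y, W)
```

When the effect really is constant and the unit and time components are additive, the estimate recovers `tau`. If the effect varies across units or over time, this single number is only a weighted summary, which is the restriction the abstract points to.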

Pre-Analysis Plans in the Machine-Learning Era

Jens Ludwig, University of Chicago
Sendhil Mullainathan, University of Chicago
Jann Spiess, Microsoft Research

Abstract

Concerns about the dissemination of spurious results have led to calls for pre-analysis plans (PAPs), which force ex-ante specificity in order to avoid ex-post “p-hacking.” But in many cases the conceptual hypotheses being tested do not directly imply the level of specificity required for a PAP: estimating an average treatment effect requires the pre-specification of control variables and how they enter the regression; analysing heterogeneous treatment effects necessitates an explicit list of subgroups or interaction effects; testing for effects on many outcome variables or checking for balance relies on a predetermined way of combining evidence across many variables. At the same time, machine-learning (ML) techniques have been developed that engage in principled ex-post specification searches to select control variables, find subgroups with different treatment effects, or combine many variables into a single test. In this paper we suggest a framework for pre-analysis plans that capitalizes on the availability of these new techniques, in which researchers combine specific aspects of the analysis that they care about or have strong priors over with ML for the flexible estimation of unspecific (or partially specific) remainders. When such machine-augmented pre-analysis plans spell out in detail which ML procedures will be used and how, they produce properly sized tests. A “cheap-lunch” theorem shows that the inclusion of ML in this way produces limited worst-case costs in power, while offering a substantial upside from systematic specification searches when the non-parametric remainder carries signal about the hypotheses of interest. These results suggest that the careful integration of ML provides two gains over existing PAPs: it (i) limits the need for researchers to make arbitrary choices in their analysis that are not implied by the initial conceptual hypothesis being tested; and it (ii) integrates ex-post analysis without fear of p-hacking.
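
One way to see how a fully pre-specified search can still give a properly sized test is a randomization test that reruns the entire search on every permuted treatment assignment. The sketch below is an illustration under assumptions, not the framework proposed in the paper: it assumes a completely randomized binary treatment, tests the sharp null of no effect for any unit, and uses a simple median-split subgroup search in place of an ML procedure. The function names and the simulated data are hypothetical.

```python
import numpy as np

def max_subgroup_effect(y, w, X):
    """Pre-specified search: largest |treated - control| mean difference
    over the full sample and the above/below-median half of each covariate.
    The subgroups depend only on X, never on the outcome or treatment."""
    masks = [np.ones(len(y), dtype=bool)]
    for j in range(X.shape[1]):
        above = X[:, j] > np.median(X[:, j])
        masks += [above, ~above]
    best = 0.0
    for m in masks:
        treated, control = m & (w == 1), m & (w == 0)
        if treated.any() and control.any():  # skip cells missing an arm
            diff = abs(y[treated].mean() - y[control].mean())
            best = max(best, diff)
    return best

def permutation_p_value(y, w, X, n_perm=199, seed=0):
    """Rerun the whole search under permuted assignments, so the
    specification search itself is accounted for in the p-value."""
    rng = np.random.default_rng(seed)
    observed = max_subgroup_effect(y, w, X)
    exceed = sum(
        max_subgroup_effect(y, rng.permutation(w), X) >= observed
        for _ in range(n_perm)
    )
    return (1 + exceed) / (1 + n_perm)

# Illustrative simulated experiment (hypothetical numbers).
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=(n, 3))
w = rng.permutation(np.repeat([0, 1], n // 2))  # half treated, half control
y = rng.normal(size=n) + 5.0 * w                # large effect, for illustration
p = permutation_p_value(y, w, X)
```

Because the search procedure is fixed before the data are seen and is applied identically to every permuted assignment, taking the maximum over subgroups does not inflate the rejection rate under the sharp null.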

Causal Impact of Democracy on Growth: An Applied Econometrics Perspective

Ivan Fernandez-Val, Boston University
Victor Chernozhukov, Massachusetts Institute of Technology

Abstract

The relationship between democracy and economic growth is of long-standing interest in economics. We revisit the empirical analysis of Acemoglu, Naidu, Restrepo and Robinson (forthcoming in the Journal of Political Economy) using state-of-the-art econometric methods. We consider variations of the GMM Arellano-Bond and fixed effects estimators of a dynamic linear panel data model with country and time fixed effects. We find that both methods produce similar estimates of the short-run and long-run effects of democracy on growth once the GMM estimator is bias-corrected for the many-instrument problem and the fixed effects estimator is bias-corrected for the incidental parameter problem. Our estimates show that the finding that democracy causes growth is not sensitive to the econometric methodology.
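
To make the distinction between short-run and long-run effects concrete, the sketch below assumes a generic dynamic linear panel specification, y_it = beta * D_it + sum_j rho_j * y_i,t-j + a_i + d_t + e_it, where D_it is the democracy indicator, a_i and d_t are country and time fixed effects, and the rho_j are coefficients on lagged outcomes. The notation and the number of lags are assumptions for illustration, not taken from the paper. In such a model the short-run effect is beta, and the long-run effect of a permanent change in D is beta / (1 - sum_j rho_j).

```python
def long_run_effect(beta, rhos):
    """Long-run effect of a permanent one-unit change in the regressor
    in a dynamic linear model with lagged-outcome coefficients `rhos`.

    Assumes the lag coefficients sum to less than 1, so the effect
    accumulates to a finite limit.
    """
    persistence = sum(rhos)
    if persistence >= 1:
        raise ValueError("lag coefficients must sum to less than 1")
    return beta / (1.0 - persistence)

# Illustrative coefficients only; these are not estimates from the paper.
effect = long_run_effect(1.0, [0.5, 0.25])  # 1.0 / (1 - 0.75) = 4.0
```

Because the long-run effect divides by one minus the persistence, small biases in the lag coefficients can translate into large changes in the implied long-run effect when persistence is high, which is one reason bias correction matters for this quantity.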

JEL Classifications
  • C2 - Single Equation Models; Single Variables