Uses of Imputation in Economic Analysis

Paper Session

Friday, Jan. 5, 2024 2:30 PM - 4:30 PM (CST)

Grand Hyatt, Presidio B

Hosted By: American Economic Association

Chair: Serena Ng, Columbia University

Parameter Recovery with Remotely Sensed Variables

Jonathan Proctor

Harvard University

Tamma Carleton

University of California-Santa Barbara

Sandy Sum

University of California-Santa Barbara

Abstract

Remotely sensed measurements and other machine learning predictions are increasingly used in place of direct observations in empirical analyses. Errors in such measures may bias parameter estimation, but it remains unclear how large such biases are or how to correct for them. We show empirically that using remotely sensed variables without correction leads to substantial bias in point estimates and standard errors across a diversity of models. We demonstrate that multiple imputation, a standard and easily implementable statistical imputation technique that has yet to be tested in this setting, effectively reduces bias and improves statistical coverage in both cross-sectional and panel data designs. Paper can be found here: https://www.nber.org/papers/w30861
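As an illustration of the pooling step that multiple imputation relies on, the sketch below combines estimates from several imputed datasets using Rubin's rules. It is a minimal example under stated assumptions, not the authors' procedure: the function name `pool_rubin` and the input numbers are hypothetical, and the imputation model that would produce the per-dataset estimates is not shown.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool M point estimates and their sampling variances with Rubin's rules.

    estimates: coefficient estimate from each imputed dataset
    variances: squared standard error from each imputed dataset
    Returns (pooled_estimate, total_variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching lengths")
    pooled = mean(estimates)           # average of the M point estimates
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (M - 1 denominator)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical coefficient estimates from M = 3 imputed datasets.
pooled, total = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

The between-imputation term is what carries uncertainty about the imputed values into the standard error; ignoring it would understate the total variance.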

Imputing Missing Values in the U.S. Census Bureau’s County Business Patterns

Fabian Eckert

University of California-San Diego

Teresa Fort

Dartmouth College

Peter Schott

Yale University

Natalie J. Yang

Columbia University

Abstract

The County Business Patterns data published by the US Census Bureau track employment by county and industry from 1946 to the present. Two features of the data limit their usefulness to researchers: (1) employment for the majority of county-industry cells is suppressed to protect confidentiality, and (2) industry classifications change over time. We address both issues. First, we develop a linear programming method that exploits the large set of adding-up constraints implicit in the hierarchical arrangement of the data to impute missing employment. Second, we provide concordances to map all data to a consistent set of industry codes. Finally, we construct a user-friendly, 1975 to 2018 county-level panel that classifies industries according to a consistent set of 2012 NAICS codes in all years. Paper can be found here: https://www.nber.org/papers/w26632
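To make the adding-up idea concrete, the sketch below uses a simpler stand-in for the linear program described above: interval tightening on a single parent cell. It assumes a parent total is disclosed, some child cells are disclosed, and each suppressed child comes with a known employment range. The function name `tighten_bounds`, the cell labels, and all numbers are hypothetical; the paper's method solves a full linear program over the whole hierarchy rather than one constraint at a time.

```python
def tighten_bounds(parent_total, known, bounds):
    """Tighten employment ranges of suppressed child cells using one adding-up constraint.

    parent_total: disclosed employment of the parent cell
    known: disclosed employment of the non-suppressed child cells
    bounds: dict mapping each suppressed child to its (low, high) range
    Returns a dict with the tightened (low, high) range for each suppressed child.
    """
    # Employment that the suppressed children must jointly account for.
    residual = parent_total - sum(known)
    tightened = {}
    for name, (low, high) in bounds.items():
        others_low = sum(l for n, (l, h) in bounds.items() if n != name)
        others_high = sum(h for n, (l, h) in bounds.items() if n != name)
        # The cell can be no smaller than the residual minus the most the others can hold,
        # and no larger than the residual minus the least the others can hold.
        new_low = max(low, residual - others_high)
        new_high = min(high, residual - others_low)
        if new_low > new_high:
            raise ValueError("bounds are inconsistent with the parent total")
        tightened[name] = (new_low, new_high)
    return tightened


# Hypothetical county-industry example: parent total 100, two disclosed children,
# two suppressed children with published size ranges.
result = tighten_bounds(100, [40, 25], {"A": (0, 19), "B": (20, 99)})
```

A linear program generalizes this by imposing every such constraint across counties and industry levels at once, which can narrow ranges further than any single constraint.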

Fixed-Effects PCA: Imputation and Inference for Large Non-stationary Panel Data with Missing Observations

Junting Duan

Stanford University

Markus Pelger

Stanford University

Ruoxuan Xiong

Emory University

Abstract

This paper studies imputation and inference for large-dimensional non-stationary panel data with missing observations. We propose a novel method, Fixed-Effects PCA (FE-PCA), for estimating a latent factor structure with non-stationary two-way fixed effects. FE-PCA is simple to use and applicable to general missing patterns, which can depend on both the latent factor structure and the two-way fixed effects. We show the consistency and asymptotic normality of the estimated fixed effects and factor model under general assumptions. The generality of our framework is particularly important for causal inference in panels, where the unobserved counterfactual outcomes can be modeled as missing values. For two well-known causal applications, we demonstrate that FE-PCA can lead to different and more credible economic conclusions compared to conventional difference-in-differences and PCA methods.
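The idea of treating counterfactual outcomes as missing values can be sketched with a much simpler model than FE-PCA. The example below masks a treated cell, fits additive unit and time effects on the observed cells by alternating averages, and fills in the masked cell. It deliberately omits the latent factor step, so it is only the two-way fixed effects building block and not the authors' estimator; the function name `impute_two_way_fe` and the toy panel are hypothetical.

```python
def impute_two_way_fe(y, n_iter=200):
    """Fill missing cells (None) of a panel with fitted unit effect + time effect.

    y: list of rows (units), each a list over time periods, with None for missing.
    Every row and every column must contain at least one observed value.
    """
    n_units, n_periods = len(y), len(y[0])
    unit_fx = [0.0] * n_units
    time_fx = [0.0] * n_periods
    for _ in range(n_iter):
        # Update each unit effect as the mean residual over its observed periods.
        for i in range(n_units):
            resid = [y[i][t] - time_fx[t] for t in range(n_periods) if y[i][t] is not None]
            unit_fx[i] = sum(resid) / len(resid)
        # Update each time effect as the mean residual over its observed units.
        for t in range(n_periods):
            resid = [y[i][t] - unit_fx[i] for i in range(n_units) if y[i][t] is not None]
            time_fx[t] = sum(resid) / len(resid)
    return [
        [y[i][t] if y[i][t] is not None else unit_fx[i] + time_fx[t] for t in range(n_periods)]
        for i in range(n_units)
    ]


# Toy panel: outcome = unit effect (1, 2, 3) + time effect (0, 10, 20).
# The last unit is treated in the last period, so its untreated outcome is masked.
panel = [[1, 11, 21], [2, 12, 22], [3, 13, None]]
filled = impute_two_way_fe(panel)
observed_treated_outcome = 28  # hypothetical observed value under treatment
estimated_effect = observed_treated_outcome - filled[2][2]
```

The estimated effect is the gap between the observed treated outcome and the imputed untreated outcome. A factor-based method extends this by allowing the imputed value to load on latent factors as well as the additive effects.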

Missing Data in Asset Pricing Panels

Joachim Freyberger

University of Bonn

Bjorn Hoppner

University of Bonn

Andreas Neuhierl

Washington University-St. Louis

Michael Weber

University of Chicago

Abstract

Missing data for return predictors is a common problem in cross-sectional asset pricing. Most papers do not explicitly discuss how they deal with missing data, but conventional treatments either focus on the subset of firms with no missing data for any predictor or impute the unconditional mean. Both methods have undesirable properties: they are either inefficient or lead to biased estimators and incorrect inference. We propose a simple and computationally attractive alternative using conditional mean imputations and weighted least squares, cast in a generalized method of moments (GMM) framework. This method allows us to use all observations with observed returns, it results in valid inference, and it can be applied in non-linear and high-dimensional settings. In Monte Carlo simulations, we find that it performs almost as well as the efficient but computationally costly GMM estimator in many cases. We apply our procedure to a large panel of return predictors and find that it leads to improved out-of-sample predictability. Paper can be found here: https://www.nber.org/papers/w30761
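The contrast between unconditional and conditional mean imputation can be illustrated with a small sketch. It assumes one predictor is always observed and a second is sometimes missing, and it imputes the missing values from a simple linear fit on the complete pairs. The function name `conditional_mean_impute` and the data are hypothetical, and the weighted least squares and GMM steps described above are not shown.

```python
def conditional_mean_impute(always_observed, sometimes_missing):
    """Impute missing predictor values from a linear fit on an observed predictor.

    always_observed: values of a predictor available for every firm
    sometimes_missing: values of a second predictor, with None where missing
    Returns the second predictor with None replaced by the fitted conditional mean.
    """
    pairs = [(a, b) for a, b in zip(always_observed, sometimes_missing) if b is not None]
    if len(pairs) < 2:
        raise ValueError("need at least two complete pairs to fit the imputation model")
    mean_a = sum(a for a, _ in pairs) / len(pairs)
    mean_b = sum(b for _, b in pairs) / len(pairs)
    sxx = sum((a - mean_a) ** 2 for a, _ in pairs)
    if sxx == 0:
        raise ValueError("observed predictor has no variation among complete pairs")
    sxy = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    slope = sxy / sxx
    intercept = mean_b - slope * mean_a
    return [
        b if b is not None else intercept + slope * a
        for a, b in zip(always_observed, sometimes_missing)
    ]


# Hypothetical cross-section of four firms; the third firm lacks the second predictor.
x1 = [1.0, 2.0, 3.0, 4.0]
x2 = [2.0, 4.0, None, 8.0]
imputed = conditional_mean_impute(x1, x2)
unconditional_mean = sum(v for v in x2 if v is not None) / 3
```

In this toy example the conditional imputation uses the firm's own observed characteristic, whereas the unconditional mean assigns every firm with a missing value the same number regardless of its other characteristics.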

JEL Classifications

C1 - Econometric and Statistical Methods and Methodology: General
C4 - Econometric and Statistical Methods: Special Topics
