Big Data and Forecasting

Paper Session

Sunday, Jan. 3, 2021 3:45 PM - 5:45 PM (EST)

Hosted By: Econometric Society
  • Chair: Eric Ghysels, University of North Carolina-Chapel Hill

Estimation and HAC-Based Inference for Machine Learning Time Series Regressions

Eric Ghysels, University of North Carolina-Chapel Hill
Andrii Babii, University of North Carolina-Chapel Hill
Jonas Striaukas, Catholic University of Louvain

Abstract

Time series regression analysis in econometrics typically relies on a set of mixing conditions to establish consistency and asymptotic normality of parameter estimates, and on HAC-type estimators of the residual long-run variances to conduct proper inference. This article introduces structured machine learning regressions for high-dimensional time series data within this commonly used setting. To recognize the time series data structures, we rely on the sparse-group LASSO estimator. We derive a new Fuk-Nagaev inequality for a class of τ-dependent processes with heavier-than-Gaussian tails, nesting α-mixing processes as a special case, and establish estimation, prediction, and inferential properties, including convergence rates of the HAC estimator for the long-run variance based on LASSO residuals. An empirical application to nowcasting US GDP growth indicates that the estimator performs favorably compared to other alternatives and that text data can be a useful addition to more traditional numerical data.
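
For readers unfamiliar with HAC estimation, the following is a minimal sketch of a long-run variance estimate computed from a residual series. It assumes a Bartlett (Newey-West) kernel and a user-chosen bandwidth; the abstract does not state which kernel or bandwidth rule the authors use, and the function name `bartlett_long_run_variance` is hypothetical.

```python
def bartlett_long_run_variance(residuals, bandwidth):
    """Bartlett-kernel (Newey-West style) long-run variance of a 1-D series.

    Assumption: a generic textbook HAC estimator, not necessarily the
    estimator analyzed in the paper.
    """
    n = len(residuals)
    mean = sum(residuals) / n
    u = [r - mean for r in residuals]  # demeaned residuals

    def autocov(lag):
        # Sample autocovariance at the given lag, normalized by n.
        return sum(u[t] * u[t - lag] for t in range(lag, n)) / n

    lrv = autocov(0)
    for lag in range(1, bandwidth + 1):
        # Bartlett weights decline linearly and keep the estimate non-negative.
        weight = 1.0 - lag / (bandwidth + 1.0)
        lrv += 2.0 * weight * autocov(lag)
    return lrv
```

With a bandwidth of zero the function reduces to the ordinary sample variance; larger bandwidths add down-weighted autocovariances to account for serial dependence.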

Measurement of Factor Strength: Theory and Practice

Natalia Bailey, Monash University
George Kapetanios, King's College London
Hashem Pesaran, University of Southern California

Abstract

This paper proposes an estimator of factor strength and establishes its consistency and asymptotic distribution. The estimator is based on the number of statistically significant factor loadings, taking multiple testing into account. Both observed and unobserved factors are considered. The small sample properties of the proposed estimator are investigated using Monte Carlo experiments. It is shown that the proposed estimation and inference procedures perform well and have excellent power properties, especially when the factor strength is sufficiently high. Empirical applications to factor models for asset returns show that out of 146 factors recently considered in the literature, only the market factor is truly strong, while all other factors are at best semi-strong, with their strength varying considerably over time. Similarly, we only find evidence of semi-strong factors using a large number of U.S. macroeconomic indicators.
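
The idea of counting significant loadings can be sketched as follows. The abstract does not give the formula, so this sketch assumes a strength measure of the form 1 + ln(share of significant loadings) / ln(n) and a normal critical value tightened for n simultaneous tests. The function name and the defaults `p` and `delta` are illustrative assumptions, not values taken from the paper.

```python
import math
from statistics import NormalDist


def factor_strength(t_stats, p=0.05, delta=1.0):
    """Rough factor-strength estimate from loading t-statistics.

    Assumptions (not stated in the abstract): the critical value is
    Phi^{-1}(1 - p / (2 * n**delta)) and the strength is
    1 + ln(share significant) / ln(n), set to 0 when nothing is significant.
    """
    n = len(t_stats)
    if n < 2:
        raise ValueError("need at least two loadings")
    # Critical value that grows with n to control for multiple testing.
    crit = NormalDist().inv_cdf(1.0 - p / (2.0 * n ** delta))
    significant = sum(1 for t in t_stats if abs(t) > crit)
    if significant == 0:
        return 0.0
    return 1.0 + math.log(significant / n) / math.log(n)
```

Under these assumptions, a factor with every loading significant has strength 1, while a factor with 10 significant loadings out of 100 has strength 0.5.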

Instability in Risk Premia

Simon Smith, Federal Reserve Board
Allan Timmermann, University of California-San Diego

Abstract

We apply a new methodology for identifying pervasive and discrete changes ("breaks") in cross-sectional risk premia and find empirical evidence that these are economically important for understanding returns on US stocks. Risk premia on the market, size, and value factors have declined systematically over time with a particularly notable reduction after the 2008-09 Global Financial Crisis. We construct a new instability risk factor from cross-sectional differences in individual stocks' exposure to time-varying risk premia and show that this factor earns a premium comparable to that of commonly used risk factors. Using industry- and characteristics-sorted portfolios, we show that some breaks to the return premium process are broad-based, affecting all stocks regardless of industry- or firm characteristics, while others are limited to stocks with specific style characteristics. Moreover, we identify distinct lead-lag patterns in how breaks to the risk premium process impact stocks in different industries and with different style characteristics.

Predicting Binary Outcomes Based on the Pair-Copula Construction

Kajal Lahiri, State University of New York-Albany
Liu Yang, Nanjing University

Abstract

This article develops a new econometric model for predicting binary outcomes. The method uses the pair-copula construction (PCC) to optimally combine diverse information contained in a set of predictors. As a building block of PCC, the conditional copula is allowed to depend on the conditioning variables directly. We apply the methodology to predict U.S. business cycles six months ahead. In terms of predictive accuracy as measured by the receiver operating characteristic (ROC) curve, the proposed scheme is found to outperform the existing combination models, as well as each single predictor. We have also evaluated the probability forecasts generated from these models using a set of diagnostic tools, each of which reveals a different aspect of the skill of the considered forecasts.
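
As an illustration of ROC-based evaluation of binary forecasts, the sketch below computes the area under the ROC curve as the share of (event, non-event) pairs that a forecast ranks correctly. The abstract does not say which ROC summary the authors report, so the use of the area under the curve is an assumption, and the function name and sample data are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for the event (e.g. recession), 0 otherwise.
    scores: forecast probabilities or any ranking score.
    Ties between an event and a non-event count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical forecasts: three of the four event/non-event pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

A value of 0.5 corresponds to a forecast with no ranking skill and 1.0 to a forecast that ranks every event above every non-event.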

Advances in Nowcasting Economic Activity: Secular Trends, Large Shocks and New Data

Thomas Drechsel, University of Maryland
Juan Antolin-Diaz, London Business School
Ivan Petrella, University of Warwick

Abstract

The assessment of macroeconomic conditions in real time is challenging. Dynamic factor models, which summarize the comovement across many macroeconomic time series as driven by a small number of shocks, have become the workhorse tool for ‘nowcasting’ activity. This paper develops a novel dynamic factor model that explicitly captures three salient features of modern business cycles: low frequency movements in long-run growth and volatility, lead-lag patterns in the responses of variables to common shocks, and fat-tailed outliers. We use real-time unrevised data for the last two decades and cloud computing technology to conduct an out-of-sample evaluation exercise of the model. The exercise demonstrates the importance of considering these features for forecasting and probability assessment of economic conditions. In an application to the COVID-19 recession, we develop a method to incorporate newly available high-frequency data. The use of such alternative data is essential to track the downturn in activity, but a careful econometric specification is just as important.

The Factor Structure of Disagreement

Edward Herbst, Federal Reserve Board
Fabian Winkler, Federal Reserve Board

Abstract

We use Bayesian methods to estimate a three-dimensional dynamic factor model on individual forecasts in the Survey of Professional Forecasters. The factors extract the most important dimensions along which forecast disagreement comoves across macroeconomic variables. We interpret our results through a generic semi-structural model where heterogeneous expectations arise because of dispersed information. Up until the Great Moderation, the factors describe disagreement about the supply side of the economy, while in recent years and particularly during the Great Recession, disagreement about the demand side of the economy has become more important. Disagreement about the course of monetary policy seems to play a minor role in the data. Our findings can serve to discipline structural models of heterogeneous expectations.

JEL Classifications

  • C3 - Multiple or Simultaneous Equation Models; Multiple Variables
  • C5 - Econometric Modeling