Marriott Marquis, Presidio 1 - 2
Hosted By:
Econometric Society
Machine Learning and High Dimensional Methods
Paper Session
Saturday, Jan. 4, 2020 2:30 PM - 4:30 PM (PST)
- Chair: Whitney Newey, Massachusetts Institute of Technology
Structural Estimation of Dynamic Equilibrium Models with Unstructured Data
Abstract
In this paper, we show how the estimation of structural DSGE models with unstructured data can be accomplished by merging standard state-space techniques with a Latent Dirichlet Allocation (LDA) model in an augmented state-space representation. The posterior distribution of parameters from the resulting representation can be sampled with Markov Chain Monte Carlo algorithms and is readily amenable to massive parallelization.
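As a concrete, hedged sketch of the augmented representation (not the paper's implementation): sklearn's LDA extracts document-topic shares, one share enters the measurement equation of a toy scalar AR(1) state-space model, and a random-walk Metropolis chain samples the posterior. The corpus, the one-parameter model, and all names below are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Assumed toy corpus standing in for the unstructured data.
docs = ["inflation pressures are rising", "output gap is closing slowly",
        "rates held steady amid uncertainty", "weak demand and soft prices",
        "strong hiring lifts wage growth", "credit conditions tighten further"]
X = CountVectorizer().fit_transform(docs)
shares = LatentDirichletAllocation(n_components=2, random_state=0).fit_transform(X)

# Augmented measurement: the first topic share is treated as a noisy
# observation of a scalar AR(1) latent state with persistence rho.
y = shares[:, 0]

def log_lik(rho, y, q=0.1, r=0.1):
    s, p, ll = 0.0, 1.0, 0.0
    for obs in y:                                    # scalar Kalman filter
        s, p = rho * s, rho**2 * p + q               # predict
        f = p + r                                    # innovation variance
        ll -= 0.5 * (np.log(2 * np.pi * f) + (obs - s)**2 / f)
        k = p / f                                    # Kalman gain
        s, p = s + k * (obs - s), (1 - k) * p        # update
    return ll

# Random-walk Metropolis over rho (flat prior on (-1, 1)).
rng = np.random.default_rng(0)
rho, draws = 0.5, []
for _ in range(5000):
    prop = rho + 0.2 * rng.normal()
    if abs(prop) < 1 and np.log(rng.uniform()) < log_lik(prop, y) - log_lik(rho, y):
        rho = prop
    draws.append(rho)
print("posterior mean of rho:", round(float(np.mean(draws[1000:])), 3))
```

Independent Metropolis chains share no state, which is what makes the massive parallelization mentioned in the abstract straightforward.

Demand Analysis with Many Prices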
Abstract
From its inception, demand estimation has faced the problem of "many prices." While some aggregation across goods is always necessary, the problem of many prices remains even after aggregation. Economic theory shows that the policy question of interest often depends on only one, or a very few, price effects. For example, estimation of consumer surplus typically depends only on the own-price effect, since all other prices are held constant. Another common feature of data is that cross-price effects tend to be small. This paper uses Lasso to mitigate the curse of dimensionality in estimating the average expenditure share from cross-section data when cross-price effects are small. We estimate bounds on consumer surplus (BCS) using a novel double/debiased Lasso method. These bounds allow for multidimensional, nonseparable heterogeneity and solve the "zeros problem" of demand by including zeros in the estimation. We also use a control function to correct for endogeneity of total expenditure. As an additional contribution, we use panel data to control for endogeneity of prices as well as expenditure. We average individual ridge regression slope estimators and bias-correct for the regularization. We give inference theory, including primitive regularity conditions, when the number of time series observations is larger than the number of parameters. We compare these methods in estimating the welfare effects of a tax on soda using scanner data. We find panel elasticities are substantially smaller than the cross-section estimates, strongly suggesting that prices are endogenous.
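The flavor of the double/debiased Lasso step can be sketched as follows (a toy partialling-out estimator on synthetic data, not the paper's BCS procedure; the data-generating process and all names are assumptions): Lasso residualizes both the expenditure share and the own price against the many cross prices with cross-fitting, and the residual-on-residual regression recovers the own-price effect.

```python
import numpy as np
from sklearn.linear_model import LassoCV
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)
n, p = 1000, 200                                     # many cross prices
cross = rng.normal(size=(n, p))
own = 0.4 * cross[:, 0] + rng.normal(size=n)         # own price, correlated
share = -0.5 * own + cross[:, :5] @ np.full(5, 0.05) + 0.5 * rng.normal(size=n)

# Cross-fit Lasso residuals for outcome and own price.
res_y, res_d = np.zeros(n), np.zeros(n)
for tr, te in KFold(5, shuffle=True, random_state=0).split(cross):
    res_y[te] = share[te] - LassoCV(cv=3).fit(cross[tr], share[tr]).predict(cross[te])
    res_d[te] = own[te] - LassoCV(cv=3).fit(cross[tr], own[tr]).predict(cross[te])

beta = res_d @ res_y / (res_d @ res_d)               # debiased own-price effect
psi = res_d * (res_y - beta * res_d)                 # influence function
se = np.sqrt(np.mean(psi**2) / np.mean(res_d**2) ** 2 / n)
print(f"own-price effect: {beta:.3f} +/- {1.96 * se:.3f} (truth -0.5)")
```

Machine Learning for Dynamic Discrete Choice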
Abstract
Dynamic discrete choice models often discretize the state vector and restrict its dimension in order to achieve valid inference. I propose a novel two-stage estimator for the set-identified structural parameter that incorporates a high-dimensional state space into the dynamic model of imperfect competition. In the first stage, I estimate the state variable's law of motion and the equilibrium policy function using machine learning tools. In the second stage, I plug the first-stage estimates into a moment inequality and solve for the structural parameter. The moment function is presented as the sum of two components, where the first expresses the equilibrium assumption and the second is a bias correction term that makes the sum insensitive (i.e., Neyman-orthogonal) to first-stage bias. The proposed estimator uniformly converges at the root-N rate, and I use it to construct confidence regions. The results developed here can be used to incorporate a high-dimensional state space into classic dynamic discrete choice models, for example, those considered in Rust (1987), Bajari et al. (2007), and Scott (2013).
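To illustrate the bias-correction structure, here is a minimal sketch in which an AIPW-style Neyman-orthogonal score stands in for the paper's moment function; the simulated entry decision, the random-forest first stage, and the parameter grid are assumptions, and cross-fitting is omitted for brevity.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 10))                         # high-dimensional state
e_true = 1 / (1 + np.exp(-x[:, 0]))                  # entry propensity
d = (rng.uniform(size=n) < e_true).astype(float)     # equilibrium entry choice
y = 1.0 * d + x[:, 0] + rng.normal(size=n)           # payoff; true theta = 1.0

# First stage: ML estimates of the policy (propensity) and payoff functions.
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(x, d)
e_hat = clf.predict_proba(x)[:, 1].clip(0.05, 0.95)
mu1 = RandomForestRegressor(random_state=0).fit(x[d == 1], y[d == 1]).predict(x)
mu0 = RandomForestRegressor(random_state=0).fit(x[d == 0], y[d == 0]).predict(x)

# Neyman-orthogonal (AIPW) score: the correction terms make the sample
# moment insensitive to first-stage estimation error.
score = mu1 - mu0 + d * (y - mu1) / e_hat - (1 - d) * (y - mu0) / (1 - e_hat)

# Second stage: collect every theta on a grid satisfying the one-sided
# sample moment inequality mean(score) - theta >= -1.645 * se.
se = score.std() / np.sqrt(n)
grid = np.linspace(0.0, 2.0, 201)
conf_set = [t for t in grid if score.mean() - t >= -1.645 * se]
print(f"confidence upper bound for theta: {max(conf_set):.2f} (truth 1.0)")
```

How Is Machine Learning Useful for Macroeconomic Forecasting?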
Abstract
We move beyond "Is Machine Learning Useful for Macroeconomic Forecasting?" by adding the "how." The current forecasting literature has focused on matching specific variables and horizons with a particularly successful algorithm. In contrast, we study a wide range of horizons and variables and learn about the usefulness of the underlying features driving ML gains over standard macroeconometric methods. We distinguish four so-called features (nonlinearities, regularization, cross-validation, and alternative loss functions) and study their behavior in both data-rich and data-poor environments. To do so, we carefully design a series of experiments that easily allow us to identify the "treatment" effects of interest. The fixed-effects regression setup prompts us to use a novel visualization technique for forecasting results that conveys all the relevant information in a digestible format. We conclude that (i) more data and nonlinearities are very useful for real variables at long horizons, (ii) the standard factor model remains the best regularization, (iii) cross-validations are not all made equal (but K-fold is as good as BIC), and (iv) one should stick with the standard L2 loss.
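One of these "treatments," nonlinearity against a linear factor benchmark, can be mimicked in a toy pseudo-out-of-sample experiment like the sketch below; the simulated panel, horizon, and models are illustrative assumptions, not the authors' design.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
T, N, h = 300, 50, 4                                 # periods, predictors, horizon
f = np.cumsum(rng.normal(size=T)) * 0.1              # persistent common factor
X = np.outer(f, rng.normal(size=N)) + rng.normal(size=(T, N))
y = np.roll(f**2 + 0.5 * f, -h) + 0.3 * rng.normal(size=T)  # nonlinear target

err_factor, err_forest = [], []
for t in range(150, T - h, 5):                       # expanding-window POOS
    Xtr, ytr = X[:t - h], y[:t - h]                  # targets realized by time t
    pca = PCA(n_components=3).fit(Xtr)
    factor = LinearRegression().fit(pca.transform(Xtr), ytr)
    forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(Xtr, ytr)
    err_factor.append((factor.predict(pca.transform(X[t:t + 1]))[0] - y[t]) ** 2)
    err_forest.append((forest.predict(X[t:t + 1])[0] - y[t]) ** 2)

print("factor-model MSE: ", np.mean(err_factor))
print("random-forest MSE:", np.mean(err_forest))
```

Dynamically optimal treatment allocation using Reinforcement Learning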
Abstract
Devising guidance on how to assign individuals to treatment is an important goal of empirical research. In practice, individuals often arrive sequentially, and the planner faces various constraints such as a limited budget or capacity, borrowing constraints, or the need to place people in a queue. For instance, a governmental body may receive a budget outlay at the beginning of a year and must decide how best to allocate resources within the year to individuals who arrive sequentially. In this and other examples involving inter-temporal trade-offs, previous work on devising optimal policy rules in a static context is either not applicable or is sub-optimal. Here we show how one can use offline observational data to estimate an optimal policy rule that maximizes ex-ante expected welfare in this dynamic context. We allow the class of policy rules to be restricted for computational, legal, or incentive-compatibility reasons. The problem is equivalent to one of optimal control under a constrained policy class, and we exploit recent developments in Reinforcement Learning (RL) to propose an algorithm that solves it. The algorithm is easily implementable and computationally efficient, with speedups achieved through multiple RL agents learning in parallel processes. We also characterize the statistical regret from using our estimated policy rule. To do this, we show that a Partial Differential Equation (PDE) characterizes the evolution of the value function under each policy. The data enable us to obtain a sample version of the PDE that provides estimates of these value functions. The estimated policy rule is the one with the maximal estimated value function. Using the theory of viscosity solutions to PDEs, we show that the policy regret decays at an n^{-1/2} rate in most examples; this is the same rate as that obtained in the static case.
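The inter-temporal trade-off can be illustrated with a tabular Q-learning stand-in (the paper itself works through a PDE characterization of the value function; the arrival process, treatment effects, and state space below are assumed toys): a planner with a fixed treatment budget sees arrivals of two types and must decide, period by period, whether to spend a unit of budget now or save it.

```python
import numpy as np

rng = np.random.default_rng(0)
T, B = 20, 5                                         # horizon, treatment budget
benefit = np.array([0.2, 1.0])                       # assumed effect by type
q = np.zeros((T, B + 1, 2, 2))                       # (time, budget, type, action)

for episode in range(40000):
    t0, b = int(rng.integers(T)), int(rng.integers(B + 1))   # exploring starts
    for t in range(t0, T):
        x = int(rng.integers(2))                     # arrival's observed type
        a = int(rng.integers(2)) if rng.uniform() < 0.1 else int(q[t, b, x].argmax())
        if b == 0:
            a = 0                                    # budget exhausted
        r = benefit[x] if a else 0.0                 # reward from treating
        # Expected continuation value over the next arrival's random type.
        future = 0.0 if t == T - 1 else q[t + 1, b - a].max(axis=1).mean()
        q[t, b, x, a] += 0.05 * (r + future - q[t, b, x, a])
        b -= a

# With 5 treatments and ~10 high-benefit arrivals expected, the learned rule
# should skip low types early and treat them near the end of the horizon.
print("treat low type at t=0, full budget? ", bool(q[0, B, 0].argmax()))
print("treat low type at t=15, full budget?", bool(q[15, B, 0].argmax()))
```

JEL Classifications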
- C1 - Econometric and Statistical Methods and Methodology: General
- C2 - Single Equation Models; Single Variables