Poster Session in Econometrics

Poster Session

Saturday, Jan. 4, 2020 8:00 AM - 10:00 AM (PDT)

Marriott Marquis, Marina Ballroom E
Hosted By: Econometric Society
  • Chair: Matias Cattaneo, Princeton University

Testing Stochastic Dominance with Many Conditioning Variables

Yoon-Jae Whang, Seoul National University

Abstract

We propose a test of the hypothesis of conditional stochastic dominance in the presence of many conditioning variables (whose dimension may grow to infinity as the sample size diverges). Our approach builds on a semiparametric location-scale model in which the conditional distribution of the outcome given the covariates is characterized by a nonparametric mean function and a nonparametric skedastic function with an independent innovation whose distribution is unknown. We propose to estimate the nonparametric mean and skedastic regression functions by $\ell_1$-penalized nonparametric series estimation with thresholding. Under the sparsity assumption, where the number of truly relevant series terms is relatively small (but their identities are unknown), we develop estimation error bounds for the regression functions and the series coefficient estimates, allowing for time series dependence. We derive the asymptotic distribution of the test statistic, which is not asymptotically pivotal, and introduce the smooth stationary bootstrap to approximate its sampling distribution. We investigate the finite sample performance of the bootstrap critical values in a set of Monte Carlo simulations. Finally, our method is illustrated with an application to stochastic dominance among portfolio returns conditional on all past information.
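
A minimal sketch of the location-scale structure described in the abstract, in illustrative notation that need not match the paper's:

```latex
% Semiparametric location-scale model: the conditional distribution of the
% outcome Y given covariates X is driven by a mean function m, a skedastic
% function \sigma, and an independent innovation \varepsilon of unknown law.
\[
  Y_t = m(X_t) + \sigma(X_t)\,\varepsilon_t , \qquad \varepsilon_t \perp X_t ,
\]
% with m and \sigma approximated by (possibly high-dimensional) series
% expansions whose coefficients are estimated by \ell_1-penalized regression
% with thresholding under sparsity, e.g.
\[
  m(x) \approx \sum_{j=1}^{p} \beta_j \psi_j(x) , \qquad p \to \infty .
\]
```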

Nonparametric Sample Splitting

Yoonseok Lee, Syracuse University
Yulong Wang, Syracuse University

Abstract

This paper develops a threshold regression model in which an unknown relationship between two variables nonparametrically determines the threshold. We allow the observations to be cross-sectionally dependent so that the model can be applied to determine an unknown spatial border for sample splitting over a random field. We derive the uniform rate of convergence and the nonstandard limiting distribution of the nonparametric threshold estimator. We also obtain the root-n consistency and the asymptotic normality of the regression coefficient estimator. Our model has broad empirical relevance, as illustrated by estimating the tipping point in social segregation problems as a function of demographic characteristics and by determining metropolitan area boundaries using nighttime light intensity collected from satellite imagery. We find that the new empirical results differ substantially from those of existing studies.
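
A stylized version of the threshold regression described above, with notation that is ours rather than the authors':

```latex
% Threshold regression in which the sample split depends on whether q_i falls
% below an unknown function g of another variable s_i:
\[
  y_i = x_i'\beta + x_i'\delta \, \mathbf{1}\{ q_i \le g(s_i) \} + u_i ,
\]
% e.g. (q_i, s_i) = (latitude, longitude), so that g(\cdot) traces out an
% unknown spatial border splitting the sample over a random field.
```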

Party on: The Labor Market Returns to Social Networks and Socializing

Adriana Lleras-Muney, University of California-Los Angeles
Shuyang Sheng, University of California-Los Angeles
Veronica Sovero, Wake Forest University

Abstract

A person's schooling years are a formative period for cognitive development, yet they are also a period of intense social interaction and friendship formation. In this paper we investigate whether socializing contributes to labor market productivity. We develop a model to explain how socializing and studying decisions influence educational attainment, network formation, and labor market outcomes. We document that individuals make investments to accumulate friends and other forms of social capital, consistent with the predictions of the model. We estimate that receiving five to six friend nominations has an impact on wages of approximately 10%, comparable to a broad set of estimates of the return to an additional year of schooling. This is the first estimate of the causal labor market returns to social capital.

Covariate Distribution Balance Via Propensity Scores

Pedro H. C. Sant'Anna, Vanderbilt University
Xiaojun Song, Peking University
Qi Xu, Vanderbilt University

Abstract

The propensity score plays an important role in causal inference with observational data. However, it is well documented that under slight model misspecification, propensity score estimates based on maximum likelihood can lead to unreliable treatment effect estimators. To address this practical limitation, this article proposes a new framework for estimating propensity scores that mimics randomized controlled trials (RCTs) in settings where only observational data are available. More specifically, given that in RCTs the joint distribution of covariates is balanced between treated and untreated groups, we propose to estimate the propensity score by maximizing covariate distribution balance. The proposed propensity score estimators, which we call the integrated propensity score (IPS), are data-driven, do not rely on tuning parameters such as bandwidths, admit an asymptotic linear representation, and can be used to estimate many different treatment effect measures in a unified manner. We derive the asymptotic properties of inverse probability weighted estimators for the average, distributional, and quantile treatment effects based on the IPS and illustrate their relative performance via Monte Carlo simulations and three empirical applications. An implementation of the proposed methods is provided in the new \texttt{IPS} package for \texttt{R}.
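
One way to formalize the covariate-balance idea sketched above, in our own notation (the paper's exact criterion may differ):

```latex
% Choose the propensity score parameters so that inverse-probability weighting
% balances the entire joint covariate distribution between treated (D = 1) and
% untreated (D = 0) units, e.g. by minimizing a Cramer-von Mises-type distance:
\[
  \hat\gamma = \arg\min_{\gamma} \int \left[ \frac{1}{n} \sum_{i=1}^{n}
    \left( \frac{D_i}{\pi(X_i;\gamma)} - \frac{1-D_i}{1-\pi(X_i;\gamma)} \right)
    \mathbf{1}\{ X_i \le u \} \right]^2 d\mu(u) ,
\]
% mimicking the balance that a randomized experiment would deliver.
```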

A New Parametrization of Correlation Matrices

Ilya Archakov, University of Vienna
Peter Hansen, University of North Carolina-Chapel Hill

Abstract

For the modeling of covariance matrices, the literature has proposed a variety of methods to enforce positive (semi)definiteness. In this paper, we propose a method based on a novel parametrization of the correlation matrix, namely the off-diagonal elements of the matrix logarithm of the correlation matrix. This parametrization has many attractive properties, a wide range of applications, and may be viewed as a multivariate generalization of Fisher's Z-transformation of a single correlation.
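
A minimal numerical sketch of the forward transformation just described (illustrative only; the inverse map, which recovers a valid correlation matrix from an arbitrary real vector, requires the diagonal adjustment developed in the paper):

```python
import numpy as np
from scipy.linalg import logm

# A valid 3x3 correlation matrix: unit diagonal, positive definite.
C = np.array([[1.0, 0.5, 0.2],
              [0.5, 1.0, 0.3],
              [0.2, 0.3, 1.0]])

# Matrix logarithm of C; the parametrization keeps only the off-diagonal
# elements of this symmetric matrix as the unrestricted parameter vector.
G = np.real(logm(C))
gamma = G[np.triu_indices_from(G, k=1)]
print(gamma)  # n(n-1)/2 = 3 free parameters for a 3x3 correlation matrix
```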

Proxy Controls and Panel Data

Ben Deaner, Massachusetts Institute of Technology

Abstract

We present a novel approach to nonparametric identification and consistent estimation in economic models using `proxy controls'. Our approach is particularly well-suited to the context of panel data with a fixed time dimension but also applies in cross-sectional settings. Proxy controls are proxies for unobserved `perfect controls', where perfect controls are variables that are sufficient for the association between potential outcomes and treatments. Our identification strategy requires that the set of available proxy controls be split into two subsets, one subset acting as an instrument for the other. In the panel case, observations from different periods can be used as proxy controls, and our key identifying assumptions follow from restrictions on the serial dependence of the data and confounding variables. We provide conditions under which our estimation problem is `well-posed'. Our estimator is straightforward to implement: the key step is penalized sieve minimum distance estimation. We derive simple convergence rates under high-level assumptions.

Automated Solution of Heterogeneous Agent Models

David Childers, Carnegie Mellon University

Abstract

In this paper I present and analyze a new linearization-based method for the automated solution of heterogeneous agent models with continuously distributed heterogeneity and aggregate shocks. The approach is based on representing the model equilibrium conditions as a system of smooth functional equations in terms of endogenously time-varying distributions and decision rules. Taking the values of these functions at a set of grid points as arguments, the equilibrium conditions can then be linearized, interpolated with respect to a set of basis functions, and solved through a procedure relying on automatic differentiation and standard discrete time linear rational expectations solution algorithms. While solution approaches based on linearization of discretized or projected models have achieved substantial popularity in recent years, it has been unclear whether such approaches generate solutions that correspond to those of the true infinite dimensional model. I characterize a broad class of models and a set of regularity conditions which ensure that this is indeed the case: the solution algorithm is guaranteed to converge to the first derivative of the true infinite dimensional solution as the discretization is refined.
The key conceptual result leading to these methods is a recognition that a broad variety of heterogeneous agent models can be interpreted as infinite width deep neural networks [Guss, 2017], constructed entirely by iterated composition of pointwise nonlinearities and linear integral operators along a directed acyclic computational graph. On a theoretical level, this formulation ensures commutativity of differentiation and sampling and so permits construction of approximate functional derivatives without performing direct manual calculations in infinite dimensional space. On a practical level, this permits implementation using existing fast and scalable libraries for automatic differentiation on Euclidean space while maintaining the consistency guarantees derived for solutions based on derivatives computed directly in infinite dimensional space in Childers [2018].
In addition to providing precise technical conditions under which this method yields accurate representations, I provide examples and guidelines for how to formulate models to ensure that these conditions are satisfied. These conditions are shown to hold in models which possess smooth conditional densities of idiosyncratic state variables as in the class of heterogeneous agent models formalized in Arellano and Bonhomme [2016] augmented with aggregate shocks, subject to a particular choice of representation of the model equations which can be implemented by a change of variables. Convergence rates for the approximation are derived, depending on the classes of functions defining the nodes in the network and the overall network topology for a variety of choices of interpolation method including polynomials, splines, histograms, and wavelets. The procedure is demonstrated numerically by application to a version of the incomplete markets model of Huggett [1993] with continuously distributed idiosyncratic and aggregate income risk.
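
A toy illustration of the discretize-then-differentiate step described above (not the paper's algorithm; the functional equation here is made up purely to show the mechanics):

```python
import jax
import jax.numpy as jnp

# Grid on which the unknown function (e.g., a decision rule or a density) is
# represented by its values; the equilibrium map acts on these grid values.
grid = jnp.linspace(0.0, 1.0, 50)

def residual(g_vals):
    # Toy equilibrium condition F(g) = 0: g(x) - (0.9 * mean(g) + x) = 0.
    return g_vals - (0.9 * jnp.mean(g_vals) + grid)

g0 = jnp.zeros_like(grid)           # candidate solution / expansion point
J = jax.jacobian(residual)(g0)      # derivative of the discretized map, by AD
g_new = g0 - jnp.linalg.solve(J, residual(g0))  # one Newton/linearization step
```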

An Averaging Estimator for Two Step M Estimation In Semiparametric Models

Ruoyao Shi, University of California-Riverside

Abstract

In this paper, we study the two step M estimation of a finite dimensional parameter which depends on a first step estimation of a potentially infinite dimensional nuisance parameter. We present an averaging estimator that combines a semiparametric estimator based on a nonparametric first step with a parametric estimator that imposes parametric restrictions on the first step. The averaging weight is the sample analog of an infeasible optimal weight that minimizes a quadratic risk function. This averaging estimator strikes a balance between the robust semiparametric estimator and the efficient parametric estimator: we show that the averaging estimator uniformly dominates the semiparametric estimator in terms of asymptotic quadratic risk regardless of whether the first step parametric restrictions hold. In particular, we prove that under certain sufficient conditions, the asymptotic lower bound of the truncated quadratic risk difference between the averaging estimator and the semiparametric estimator is strictly less than zero under a class of data generating processes that includes both correct specification and misspecification of the first step parametric restrictions, and the asymptotic upper bound is weakly less than zero.
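
A schematic of the averaging combination described in the abstract (illustrative notation):

```latex
% Convex combination of the efficient parametric estimator and the robust
% semiparametric estimator, with a data-driven weight:
\[
  \hat\theta(\hat w) = \hat w \, \hat\theta_{\mathrm{param}}
                     + (1 - \hat w) \, \hat\theta_{\mathrm{semip}} ,
  \qquad \hat w \in [0, 1] ,
\]
% where \hat{w} is the sample analog of the infeasible weight minimizing the
% asymptotic quadratic risk of \hat\theta(w).
```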

Improved Estimation by Simulated Maximum Likelihood

Kirill Evdokimov, Pompeu Fabra University
Ilze Kalnina, North Carolina State University

Abstract

We propose a method that improves the Simulated Maximum Likelihood Estimator (SMLE). The method does not impose any additional assumptions on the model; instead, it makes more efficient use of the information available in this framework. Our approach allows reducing the number of simulation draws $S$ without sacrificing the precision of the estimator. In particular, the method provides a semiparametrically efficient estimator when $\sqrt{n}/S \to 0$, a situation in which the SMLE is biased. Moreover, under certain smoothness restrictions, our estimator can be asymptotically efficient when $S$ is finite. The method should be most useful when the evaluation of the simulated likelihood function is computationally expensive.
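
For context, a common form of the simulated likelihood to which such corrections apply, in stylized notation:

```latex
% The intractable density f(y | x; \theta) is replaced by an average over S
% simulation draws of the latent variables \eta_{is}, and the estimator
% maximizes the resulting simulated log-likelihood:
\[
  \hat f_S(y_i \mid x_i; \theta) = \frac{1}{S} \sum_{s=1}^{S}
      f(y_i \mid x_i, \eta_{is}; \theta) , \qquad
  \hat\theta_{\mathrm{SML}} = \arg\max_{\theta} \sum_{i=1}^{n}
      \log \hat f_S(y_i \mid x_i; \theta) ,
\]
% whose simulation bias is negligible only when S grows fast relative to \sqrt{n}.
```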

Quantile Factor Models

Liang Chen, Shanghai University of Finance and Economics
Juan Dolado, University Carlos III of Madrid
Jesus Gonzalo, University Carlos III of Madrid

Abstract

Quantile Factor Models (QFM) represent a new class of factor models for high-dimensional panel data. Unlike Approximate Factor Models (AFM), where only mean-shifting factors can be extracted, QFM also allow one to recover unobserved factors shifting other relevant parts of the distributions of observed variables. A quantile regression approach, labeled Quantile Factor Analysis (QFA), is proposed to consistently estimate all the quantile-dependent factors and loadings. Their asymptotic distribution is then derived using a kernel-smoothed version of the QFA estimators. Two consistent model selection criteria, based on information criteria and rank minimization, are developed to determine the number of factors at each quantile. Moreover, in contrast to the conditions required for the use of Principal Components Analysis in AFM, QFA estimation remains valid even when the idiosyncratic errors have heavy-tailed distributions. Three empirical applications (to macroeconomic, climate, and finance panel data) provide evidence that extra factors shifting quantiles other than the means can be relevant in practice.
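
A stylized statement of the quantile factor structure and the QFA estimation problem (illustrative notation):

```latex
% At quantile \tau, the observation X_{it} loads on quantile-specific factors
% f_t(\tau) with loadings \lambda_i(\tau):
\[
  Q_\tau\!\left( X_{it} \mid f_t(\tau) \right) = \lambda_i(\tau)' f_t(\tau) ,
\]
% and QFA estimates loadings and factors jointly by minimizing the check
% function over all units and periods:
\[
  \min_{\{\lambda_i\}, \{f_t\}} \frac{1}{NT} \sum_{i=1}^{N} \sum_{t=1}^{T}
      \rho_\tau\!\left( X_{it} - \lambda_i' f_t \right) ,
  \qquad \rho_\tau(u) = u \left( \tau - \mathbf{1}\{u < 0\} \right) .
\]
```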

On Testing Continuity and the Detection of Failures

Sida Peng, Microsoft Research
Matthew Backus, Columbia University

Abstract

Estimation of discontinuities is pervasive in applied economics: from the study of sheepskin effects to prospect theory and ``bunching'' of reported income on tax returns, models that predict discontinuities in outcomes are uniquely attractive for empirical testing. However, existing empirical methods often rely on assumptions about the number of discontinuities, their type, their location, or the underlying functional form of the model. We develop a nonparametric approach to the study of arbitrary discontinuities --- point discontinuities as well as jump discontinuities in the $n$th derivative, where $n = 0, 1, \ldots$ --- that does not require such assumptions. Our approach exploits the development of false discovery rate control methods for lasso regression proposed by G'Sell et al. (2015). This framework affords us the ability to construct valid tests for both the null of continuity and the significance of any particular discontinuity without the computation of nonstandard distributions. We illustrate the method with a series of Monte Carlo examples and by replicating prior work detecting and measuring discontinuities, in particular Lee (2008), Card et al. (2008), Reinhart and Rogoff (2010), and Backus et al. (2016).

Simultaneous Mean-Variance Regression

Richard Spady, Johns Hopkins University
Sami Stouli, University of Bristol

Abstract

We propose simultaneous mean-variance regression for the linear estimation and approximation of conditional mean functions. In the presence of heteroskedasticity of unknown form, our method accounts for varying dispersion in the regression outcome across the support of the conditioning variables by using weights that are jointly determined with the mean regression parameters. Simultaneity generates outcome predictions that are guaranteed to improve on the prediction error of ordinary least squares, with corresponding parameter standard errors that are automatically valid. Under shape misspecification of the conditional mean and variance functions, we establish existence and uniqueness of the resulting approximations and characterize their formal interpretation and robustness properties. In particular, we show that the corresponding mean-variance regression location-scale model weakly dominates the ordinary least-squares location model under a Kullback-Leibler measure of divergence, with strict improvement in the presence of heteroskedasticity. The simultaneous mean-variance regression loss function is globally convex and the corresponding estimator is easy to implement. We establish its consistency and asymptotic normality under misspecification, provide robust inference methods, and present numerical simulations that show large improvements over ordinary and weighted least squares in terms of estimation and inference in finite samples. We further illustrate our method with two empirical applications: the relationship between economic prosperity in 1500 and today, and the demand for gasoline in the United States.
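
A sketch of a jointly convex mean-variance criterion of the kind described above, in our stylized notation (see the paper for the exact objective):

```latex
% Mean and scale parameters are determined simultaneously by a single convex
% criterion, with the linear scale x'\gamma weighting the squared residual:
\[
  \min_{\beta, \, \gamma : \, x'\gamma > 0} \;
  \mathbb{E}\!\left[ \frac{(Y - X'\beta)^2}{2\, X'\gamma} + \frac{X'\gamma}{2} \right] .
\]
% The quadratic-over-linear term is jointly convex in (\beta, \gamma), which is
% what makes the simultaneous problem globally convex.
```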

Estimation of a Nonparametric Model for Bond Prices from Cross-Section and Time Series Information

Bonsoo Koo, Monash University
Davide La Vecchia, University of Geneva
Oliver Linton, University of Cambridge

Abstract

We develop estimation methodology for an additive nonparametric panel model that is suitable for capturing the pricing of coupon-paying government bonds followed over many time periods. We use our model to estimate the discount function and yield curve of nominally riskless government bonds. The novelty of our approach is the combination of two different techniques: cross-sectional nonparametric methods and kernel estimation for time-varying dynamics in the time series context. The resulting estimator is used for predicting individual bond prices given the full schedule of their future payments. In addition, it is able to capture the yield curve shapes and dynamics commonly observed in fixed income markets. We establish the consistency, the rate of convergence, and the asymptotic normality of the proposed estimator. A Monte Carlo exercise illustrates the good performance of the method under different scenarios. We apply our methodology to the daily CRSP bond market dataset and compare our estimator with the popular Diebold and Li (2006) method.
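
A stylized version of the pricing relation underlying the model described above (illustrative notation):

```latex
% The price of bond i at date t is the sum of its remaining payments c_{ij},
% each discounted by a time-varying nonparametric discount function d_t
% evaluated at that payment's time to maturity m_{ij}:
\[
  P_{it} = \sum_{j} c_{ij} \, d_t(m_{ij}) + \varepsilon_{it} ,
\]
% so that estimating d_t(\cdot) from the panel of bond prices delivers the
% discount function and, equivalently, the yield curve at each date.
```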

Identification of Treatment Effects with Mismeasured Imperfect Instruments

Desire Kedagni, Iowa State University

Abstract

In this article, I develop a novel identification result for estimating the effect of an endogenous treatment using a proxy of an unobserved imperfect instrument. I show that the potential outcomes distributions are partially identified for the compliers. Therefore, I derive sharp bounds on the local average treatment effect. I write the identified set in the form of conditional moments inequalities, which can be implemented using existing inferential methods. I illustrate my methodology on the National Longitudinal Survey of Youth 1979 to evaluate the returns to college attendance using tuition as a proxy of the true cost of going to college. I find that the average return to college attendance for people who attend college only because the cost is low is between 29% and 78%.

Robust Inference about Conditional Tail Features: A Panel Data Approach

Yuya Sasaki, Vanderbilt University
Yulong Wang, Syracuse University

Abstract

We develop a new extreme value theory for panel data and use it to construct asymptotically valid confidence intervals (CIs) for conditional tail features such as the conditional extreme quantile and the conditional tail index. As a by-product, we also construct CIs for tail features of the coefficients in the random coefficient regression model. The new CIs are robustly valid without parametric assumptions and have excellent small sample coverage and length properties. Applying the proposed method, we study the tail risk of monthly U.S. stock returns and find that (i) the left tail features of stock returns and those of the Fama-French regression residuals depend heavily on other stock characteristics such as stock size; and (ii) the alphas and betas are strongly heterogeneous across stocks in the Fama-French regression. These findings suggest that the Fama-French model is insufficient to characterize the tail behavior of stock returns.

Bounds on Causal Effects in Continuous Instrumental Variable Models

Florian Gunsilius, Brown University

Abstract

A goal in the causal inference literature is to obtain a tractable procedure for estimating sharp bounds on causal effects that is flexible enough to incorporate structural assumptions into the estimation process in a simple way. Existing approaches either focus on binary treatments or are intractable in most practical settings, especially when the endogenous treatment is continuous. This article introduces a computationally tractable method that allows for continuous treatments, outcomes, and instruments. It considers the instrumental variable model as two dependent stochastic processes and constructs an infinite dimensional linear program on their paths, the solution to which provides the counterfactual bounds. This framework is the natural analogue of the complier, defier, never-taker, and always-taker distinction in the continuous setting and makes it possible to incorporate structural assumptions on the model in a unified manner. As a proof of concept, we apply it to obtain bounds on distributional causal effects of expenditures on leisure and food, using the 1996 UK Family Expenditure Survey. It yields the expected result that food is a necessity good and leisure is a luxury good under much weaker assumptions than previous methods.

Dynamic Foundations for Empirical Static Games

Niccolo Lomys, Toulouse School of Economics
Lorenzo Magnolfi, University of Wisconsin-Madison
Camilla Roncoroni, University of Warwick

Abstract

We propose a simple estimation strategy when data on strategic interaction are interpreted as the long-run result of a history of game plays. Players interact repeatedly in an incomplete information game, possibly while learning how to play in such a game. We remain agnostic about the details of the learning process and only impose a minimal behavioral assumption describing an optimality condition for the long-term outcome of players' interaction. In particular, we assume that play satisfies a property of ``asymptotic no regret'' (ANR). This property requires that the time average of the counterfactual increase in past payoffs, had different actions been played, becomes approximately zero in the long run. A large class of well-known algorithms for the repeated play of the incomplete information game satisfies the ANR property. We show that, under the ANR assumption, it is possible to partially identify the structural parameters of players' payoff functions. We establish our result in two steps. First, we prove that the time average of play that satisfies ANR converges to the set of Bayes correlated equilibria of the underlying static game. To do so, we extend to incomplete information environments prior results on dynamic foundations for equilibrium play in static games of complete information. Second, we show how to use the limiting model to obtain consistent estimates of the parameters of interest. We apply our method to data on pricing behavior in an online platform.
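
A stylized statement of the asymptotic no regret property described above (illustrative notation):

```latex
% Along the realized history of play, no fixed alternative action a' would
% have done better on average in the long run:
\[
  \limsup_{T \to \infty} \; \max_{a'} \; \frac{1}{T} \sum_{t=1}^{T}
    \Big[ u_i\big(a', a_{-i,t}, \theta\big) - u_i\big(a_{i,t}, a_{-i,t}, \theta\big) \Big]
    \;\le\; 0 ,
\]
% a condition satisfied by many regret-minimizing learning algorithms, under
% which time-average play converges to the set of Bayes correlated equilibria.
```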

Identification of Auction Models Using Order Statistics

Yao Luo, University of Toronto
Ruli Xiao, Indiana University

Abstract

Auction data often fail to record all bids or all relevant factors that shift bidder values, which complicates identification of the underlying value distribution. This paper considers identification of independent private value auctions with discrete unobserved heterogeneity (UH) using multiple order statistics of bids. Usual measurement error approaches are inapplicable due to dependence among order statistics. We provide a set of positive results. First, we show that models with nonseparable finite UH are identifiable using three consecutive order statistics or two consecutive ones with an instrument. Second, two arbitrary order statistics identify models with nonseparable finite UH if UH provides support variations. Lastly, we apply our methods to U.S. Forest Service timber auctions and find that ignoring UH reduces both bidder and auctioneer surplus.

Synthetic Regression Discontinuity: Estimating Treatment Effects Using Machine Learning

Pietro Bonaldi, Carnegie Mellon University
Jörn Boehnke, University of California-Davis

Abstract

In the standard regression discontinuity setting, treatment assignment is based on whether a unit's observable score (running variable) crosses a known threshold. We propose a two-stage method to estimate the treatment effect when the score is unobservable to the econometrician while the treatment status is known for all units. In the first stage, we use a statistical model to predict a unit's treatment status based on a continuous synthetic score. In the second stage, we apply a regression discontinuity design using the predicted synthetic score as the running variable to estimate the treatment effect on an outcome of interest. We establish conditions under which the method identifies the local treatment effect for a unit at the threshold of the unobservable score, the same parameter that a standard regression discontinuity design with a known score would identify. We also examine the properties of the estimator using simulations and propose the use of machine learning algorithms to achieve high prediction accuracy. Finally, we apply the method to measure the effect of an investment grade rating by any of the three largest credit rating agencies on corporate bond prices. We find an average 1% increase in the prices of corporate bonds that received an investment grade rather than a non-investment grade rating.
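
A minimal two-stage sketch of the idea above using off-the-shelf tools and a made-up data generating process (hypothetical variable names; the authors' estimator, bandwidth, and tuning choices may differ):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LinearRegression

# Toy data, purely to exercise the two stages: X are observed characteristics,
# treatment D depends on an index of X plus unobservables, and the outcome y
# includes a treatment effect of 2.
rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
latent_score = X @ np.array([1.0, -0.5, 0.3, 0.0, 0.2])
D = (latent_score + rng.normal(size=2000) > 0).astype(int)
y = 2.0 * D + X[:, 0] + rng.normal(size=2000)

# Stage 1: predict treatment status; the predicted probability serves as the
# continuous synthetic score (running variable).
score = GradientBoostingClassifier().fit(X, D).predict_proba(X)[:, 1]

# Stage 2: local linear regression on each side of the classification
# threshold c = 0.5 within a bandwidth h; the intercept gap is the RD-style
# estimate at the cutoff of the synthetic score (illustrative only).
c, h = 0.5, 0.15
left = (score > c - h) & (score <= c)
right = (score > c) & (score < c + h)
m0 = LinearRegression().fit((score[left] - c).reshape(-1, 1), y[left])
m1 = LinearRegression().fit((score[right] - c).reshape(-1, 1), y[right])
tau_hat = m1.intercept_ - m0.intercept_
print(round(float(tau_hat), 2))
```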

On Counterfactual Analysis of Differentiable Functionals

Yaroslav Mukhin, Massachusetts Institute of Technology

Abstract

Counterfactual probability distributions are important elements of policy analysis, Oaxaca-Blinder style decomposition analysis, robustness and sensitivity analysis in empirical economics. In this paper we solve two complementary problems of statistical counterfactual analysis: (i) Given a counterfactual change in a scalar functional of a probability distribution, we describe the counterfactual distributions that have such an effect on the functional and deviate minimally from the status quo distribution in a continuous fashion. (ii) Given a counterfactual distribution, we compute the change in a statistical functional relative to the status quo distribution by integrating its local changes along a path from the status quo to the counterfactual distribution. In combination, these two exercises provide a general framework for measuring the local and global relationship between structural parameters or counterfactuals and descriptive statistics or specific features of the data. To solve these problems, we use von Mises calculus (i.e. influence functions), information geometry, optimal transport, and introduce gradient score flows. We define a unique path of counterfactual distributions with a combination of a statistical functional and a metric of distance or cost on the nonparametric manifold of probability distributions via the gradient flow of the functional. We describe the gradient flow paths obtained with the Fisher-Rao information metric, 2-Wasserstein optimal transport metric, and their weighted variants.

Blacklisted

John Lazarev, University of Pennsylvania

Abstract

A firm can refuse to sell to a consumer for any reason or no reason at all. The paper investigates the link between a firm's incentives to blacklist a customer and market structure. The paper shows that private incentives to blacklist differ from socially optimal ones. The former trades off forfeited profits from ``false positives'' against incurred losses from ``false negatives,'' while the latter also takes consumer surplus into account. Competition may reduce social losses from blacklisting profitable customers by mistake, yet extensive data sharing creates an impediment. Competition is especially important when decisions to blacklist are made by platforms, which is a novel concern for antitrust enforcement.

A Markov Switching Cointegration Analysis of the CDS-Bond Basis Puzzle

Massimo Guidolin, Bocconi University
Manuela Pedio, Bocconi University

Abstract

We investigate the long-run equilibrium relationship between credit default swap (CDS) premia and bond spreads for 65 U.S. corporate entities and 6 major banks over the period April 2011 – February 2018. Standard regression methods reveal that for 40 out of 71 entities the two series fail to be cointegrated, which is puzzling because it represents a violation of the efficient market hypothesis. In contrast to the previous literature, we estimate a Markov switching vector error correction model to capture the fact that distinct regimes may characterize the adjustment mechanism and the short-term dynamics between the spread series. We use this framework to investigate how the two markets contribute to the price discovery of credit risk. We find that many entities switch between a regime where the adjustment to the long-run cointegrating relationship does not take place and a regime where either the CDS or the bond market leads the price discovery. Finally, we investigate whether the industry or other firm characteristics, such as the rating, the leverage, and the capital structure, may be relevant in determining which of the two markets leads the price discovery process.
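
A stylized two-regime version of the error correction dynamics described above (illustrative notation):

```latex
% Markov switching vector error correction model for the CDS premium and the
% bond spread, with the latent regime s_t governing the speed of adjustment
% to the long-run cointegrating relationship \beta' Y_{t-1}:
\[
  \Delta Y_t = \alpha(s_t) \, \beta' Y_{t-1} + \Gamma(s_t) \, \Delta Y_{t-1}
             + \varepsilon_t ,
  \qquad Y_t = \big( \mathrm{CDS}_t , \; \mathrm{spread}_t \big)' ,
\]
% where \alpha(s_t) \approx 0 in a "no adjustment" regime and the relative
% magnitudes of its elements indicate which market leads price discovery.
```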
JEL Classifications
  • C1 - Econometric and Statistical Methods and Methodology: General