Empirical Methods
Paper Session
Friday, Jan. 5, 2024 10:15 AM - 12:15 PM (CST)
- Chair: Wayne Yuan Gao, University of Pennsylvania
Log-like? Identified ATEs Defined with Zero-Valued Outcomes Are (Arbitrarily) Scale-Dependent
Abstract
Economists frequently estimate average treatment effects (ATEs) for transformations of the outcome that are well-defined at zero but behave like log(y) when y is large (e.g., log(1+y), arcsinh(y)). We show that these ATEs depend arbitrarily on the units of the outcome, and thus should not be interpreted as percentage effects. In line with this result, we find that estimated treatment effects for arcsinh-transformed outcomes published in the American Economic Review change substantially when we multiply the units of the outcome by 100 (e.g., convert dollars to cents). To help delineate alternative approaches, we prove that when the outcome can equal zero, there is no parameter of the form E[g(Y(1),Y(0))] that is point-identified and unit-invariant. We conclude by discussing sensible alternative target parameters for settings with zero-valued outcomes that relax at least one of these requirements.
When Are Estimates Independent of Measurement Units?
Abstract
Data transformations often facilitate regression analysis, yet many commonly used transformations make hypothesis testing misleading because the results depend on the measurement units of the data. This paper establishes a six-way equivalence that fully characterizes the set of transformations for which the conclusions are independent of measurement units. Central to these results are the concepts of scale equivariance (that any scaling of the data can be reversed by appropriately rescaling the estimator) and scale invariance (that any scaling of the data does not affect the estimator). The equivalence result demonstrates that desirable properties such as scale-equivariant coefficient estimates, scale-invariant t-statistics, and scale-invariant semi-elasticities arise if and only if the transformation is a logarithmic or a power function. Power transformations thus provide a natural extension of logarithmic transformations that both preserves the essential feature of obtaining unit-independent estimates for unitless quantities of interest and is defined at zero. Popular alternatives that approximate the shape of the logarithmic function at large values, such as adding a small positive constant before applying a logarithmic transformation or the inverse hyperbolic sine transformation, can result in arbitrarily large semi-elasticity estimates and can change sign and statistical significance depending on the choice of measurement units, which we highlight both theoretically and empirically.
Optimal Categorical Instrumental Variables
Abstract
This paper discusses estimation with a categorical instrumental variable in settings with potentially few observations per category. The proposed categorical instrumental variable estimator (CIV) leverages a regularization assumption that implies the existence of a latent categorical variable with fixed finite support achieving the same first-stage fit as the observed instrument. In asymptotic regimes that allow the number of observations per category to grow at an arbitrarily small polynomial rate with the sample size, I show that when the cardinality of the support of the optimal instrument is known, CIV is root-n asymptotically normal, achieves the same asymptotic variance as the oracle IV estimator that presumes knowledge of the optimal instrument, and is semiparametrically efficient under homoskedasticity. Under-specifying the number of support points reduces efficiency but maintains asymptotic normality.
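As a rough illustration of the idea in this abstract, the Python sketch below simulates a categorical instrument with many categories whose first-stage means take only two values, then compares an oracle IV estimate (which uses the true two-valued optimal instrument) with a feasible estimate that first groups the categories. The abstract does not say how the latent categorical variable is estimated. Clustering the category-level first-stage means with a one-dimensional k-means is an assumption made here for illustration, as are the function names (kmeans_1d, iv_slope, civ_sketch), the two-point latent type, and all simulation parameters. It should not be read as the paper's estimator.

```python
import numpy as np


def kmeans_1d(values, n_clusters, n_iter=50):
    """Plain Lloyd iterations on scalars; returns one cluster label per value."""
    # Start the centers at evenly spaced quantiles of the data.
    probs = (np.arange(n_clusters) + 0.5) / n_clusters
    centers = np.quantile(values, probs)
    labels = np.zeros(values.shape[0], dtype=int)
    for _ in range(n_iter):
        labels = np.abs(values[:, None] - centers[None, :]).argmin(axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):  # an empty cluster keeps its old center
                centers[k] = values[labels == k].mean()
    return labels


def iv_slope(y, d, w):
    """Just-identified IV slope of y on d using the scalar instrument w."""
    w_c = w - w.mean()
    return float(w_c @ y / (w_c @ d))


def simulate(n_categories=400, n_per_category=50, beta=1.0, seed=0):
    """Hypothetical design: many categories, two-point latent first stage."""
    rng = np.random.default_rng(seed)
    z = np.repeat(np.arange(n_categories), n_per_category)  # observed category
    latent = rng.integers(0, 2, size=n_categories)          # latent type per category
    m0 = latent[z].astype(float)                            # optimal instrument (0 or 1)
    v = rng.normal(size=z.shape[0])
    u = 0.5 * v + rng.normal(size=z.shape[0])               # u correlated with v: d is endogenous
    d = m0 + v
    y = beta * d + u
    return y, d, z, m0


def civ_sketch(y, d, z, n_support):
    """Cluster category means of d, then use group means of d as the instrument."""
    n_categories = int(z.max()) + 1
    counts = np.bincount(z, minlength=n_categories)
    cat_means = np.bincount(z, weights=d, minlength=n_categories) / counts
    cluster = kmeans_1d(cat_means, n_support)[z]            # estimated latent group per observation
    group_n = np.maximum(np.bincount(cluster, minlength=n_support), 1)
    group_means = np.bincount(cluster, weights=d, minlength=n_support) / group_n
    return iv_slope(y, d, group_means[cluster])


y, d, z, m0 = simulate()
beta_oracle = iv_slope(y, d, m0)              # infeasible: uses the true optimal instrument
beta_civ = civ_sketch(y, d, z, n_support=2)   # feasible: support size assumed known
print(f"oracle IV: {beta_oracle:.3f}  clustered IV: {beta_civ:.3f}")
```

In this simulated design the support size (two points) is passed in as known, mirroring the case the abstract describes. Passing a smaller or larger n_support is one way to explore how misspecifying the number of support points affects the estimate in this toy setting.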
Two-Stage Maximum Score Estimator
Abstract
This paper considers the asymptotic theory of a semiparametric M-estimator that is generally applicable to models that satisfy a monotonicity condition in one or several parametric indexes. We call this estimator the two-stage maximum score (TSMS) estimator, since our estimator involves a first-stage nonparametric regression when applied to the binary choice model of Manski (1975, 1985). We characterize the asymptotic distribution of the TSMS estimator, which features phase transitions depending on the dimension of the first-stage estimation. We show that the TSMS estimator is asymptotically equivalent to the smoothed maximum-score estimator (Horowitz, 1992) when the dimension of the first-step estimation is relatively low, while still achieving partial rate acceleration relative to the cubic-root rate when the dimension is not too high. Effectively, the first-stage nonparametric estimator serves as an imperfect smoothing function on a non-smooth criterion function, leading to the pivotality of the first-stage estimation error with respect to the second-stage convergence rate and asymptotic distribution.
JEL Classifications
- C2 - Single Equation Models; Single Variables