High-Dimensional Econometrics and Machine Learning

Paper Session

Sunday, Jan. 3, 2021 3:45 PM - 5:45 PM (EST)

Hosted By: Econometric Society

Chair: Mikkel Soelvsten, University of Wisconsin-Madison

Shapes as Product Differentiation: Neural Network Embedding in the Analysis of Markets for Fonts

Sukjin Han

University of Texas-Austin

Eric Schulman

University of Texas-Austin

Abstract

Many products have key attributes that are high dimensional (e.g., design, text). Quantifying these attributes is important for economic analysis. This paper considers one of the simplest design products, fonts, and quantifies their shape by constructing their embeddings using a modern convolutional neural network. The embedding maps fonts’ shapes onto a low-dimensional vector. Importantly, we verify the resulting embedding is economically meaningful by showing that mutual information is high between the embedding and descriptions assigned to each font by font designers and consumers. We illustrate the usefulness of the embeddings by a simple trend analysis of font style.
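
The abstract does not say how mutual information is estimated. As a minimal sketch only, assume each font's embedding has already been discretized into a cluster label and each font carries a single descriptive tag; the plug-in estimate for two discrete variables can then be computed as below. The function name, the cluster labels, and the tags are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))  # empirical joint counts
    px = Counter(xs)              # empirical marginal counts of xs
    py = Counter(ys)              # empirical marginal counts of ys
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Hypothetical data: embedding clusters and designer-assigned tags.
tags = ["serif", "serif", "sans", "sans"]
aligned_clusters = [0, 0, 1, 1]    # clusters line up with the tags
unrelated_clusters = [0, 1, 0, 1]  # clusters carry no tag information

mi_aligned = mutual_information(aligned_clusters, tags)
mi_unrelated = mutual_information(unrelated_clusters, tags)
```

In this toy case the aligned clusters reach the maximum possible value for two equally likely tags, log 2, while the unrelated clusters give zero. A continuous embedding would need a different estimator.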

Sparse Quantile Regression

Le-Yu Chen

Academia Sinica

Sokbae (Simon) Lee

Columbia University

Abstract

We estimate a quantile regression model with a penalty on the number of selected covariates. We derive probability bounds on the estimated sparsity as well as probability and expectation bounds on the excess quantile prediction risk and the mean-square parameter estimation error of our proposed estimator. These theoretical results are non-asymptotic and established in a high-dimensional setting. In particular, we show that our method yields a sparse estimator whose L0-norm can be close to true sparsity with high probability and obtain the oracle rates of convergence for the excess prediction risk and the mean-square parameter estimation error. We implement the proposed procedure via the method of mixed integer linear programming and also a more scalable first-order approximation algorithm. The finite-sample numerical performance is illustrated in Monte Carlo experiments.
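
To make the penalized criterion concrete, the following sketch evaluates an average check (pinball) loss plus a penalty proportional to the number of nonzero coefficients. The scaling of the loss, the penalty weight `lam`, and the toy data are assumptions for illustration; the abstract does not give the exact objective, and this code only evaluates the criterion rather than solving the mixed integer program.

```python
def check_loss(u, tau):
    """Quantile check loss: u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def l0_penalized_objective(y, X, beta, tau, lam):
    """Average check loss plus lam times the number of nonzero coefficients."""
    n = len(y)
    risk = 0.0
    for yi, xi in zip(y, X):
        fitted = sum(b * x for b, x in zip(beta, xi))
        risk += check_loss(yi - fitted, tau)
    sparsity = sum(1 for b in beta if b != 0.0)  # L0 "norm" of beta
    return risk / n + lam * sparsity


# Toy data: first column is an intercept, second a single covariate.
y = [1.0, 2.0, 3.0]
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]

dense = [1.0, 1.0]   # fits exactly, uses two coefficients
sparse = [2.0, 0.0]  # median-only fit, uses one coefficient

obj_dense = l0_penalized_objective(y, X, dense, tau=0.5, lam=0.5)
obj_sparse = l0_penalized_objective(y, X, sparse, tau=0.5, lam=0.5)
```

With this fairly heavy penalty the sparse candidate attains the lower objective even though the dense one fits the data exactly, which is the trade-off the penalty is meant to control.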

Testing Many Restrictions Under Heteroskedasticity

Stanislav Anatolyev

CERGE-EI and New Economic School

Mikkel Soelvsten

University of Wisconsin-Madison

Abstract

We propose a hypothesis test that allows for many tested restrictions in a heteroskedastic linear regression model. The test compares the conventional F-statistic to a critical value that corrects for many restrictions and conditional heteroskedasticity. The correction utilizes leave-one-out estimation to recenter the conventional critical value and leave-three-out estimation to rescale it. Large sample properties of the test are established in an asymptotic framework where the number of tested restrictions may grow in proportion to the number of observations. We show that the test is asymptotically valid and has non-trivial asymptotic power against the same local alternatives as the exact F test when the latter is valid. Simulations corroborate the relevance of these theoretical findings and suggest excellent size control in moderately small samples also under strong heteroskedasticity.
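
The abstract does not spell out how the leave-one-out quantities enter the critical value. The sketch below illustrates only the generic building block: a leave-one-out residual in a simple linear regression, computed once by refitting without each observation and once through the standard leverage shortcut e_i / (1 - h_ii). All names and data are hypothetical, and this is not the proposed test.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of a simple least-squares line."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - slope * xbar, slope


def loo_residuals_refit(x, y):
    """Leave-one-out residuals by refitting without each observation."""
    out = []
    for i in range(len(x)):
        a, b = ols_fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        out.append(y[i] - (a + b * x[i]))
    return out


def loo_residuals_shortcut(x, y):
    """Leave-one-out residuals via full-sample residuals and leverages."""
    n = len(x)
    a, b = ols_fit(x, y)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    out = []
    for xi, yi in zip(x, y):
        leverage = 1.0 / n + (xi - xbar) ** 2 / sxx  # h_ii for this design
        out.append((yi - (a + b * xi)) / (1.0 - leverage))
    return out


x = [0.0, 1.0, 2.0, 3.0, 5.0]
y = [1.0, 2.5, 2.0, 4.5, 6.0]
by_refit = loo_residuals_refit(x, y)
by_shortcut = loo_residuals_shortcut(x, y)
```

Both routes give the same residuals up to floating-point error, which is why leave-one-out corrections can be computed without refitting the regression n times.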

Inference for High-Dimensional Exchangeable Arrays

Harold Chiang

Vanderbilt University

Kengo Kato

Cornell University

Yuya Sasaki

Vanderbilt University

Abstract

For multiway cluster sampled data and dyadic data, we develop novel bootstrap methods and theories for inference about multidimensional, increasing-dimensional and high-dimensional parameters. Based on non-asymptotic Gaussian approximation error bounds for the test-statistic on hyper-rectangles, we propose novel bootstrap methods and establish their finite sample validity. We illustrate applications of our proposed methods to robust inference in demand analysis, robust inference in extended gravity analysis, and construction of uniform confidence bands for densities of migration and trade.

Inference for Heterogeneous Treatment Effects for Observational Data with High-Dimensional Covariates

Jing Tao

University of Washington

Abstract

We consider heterogeneous treatment effects on a set of high-dimensional covariates for observational data without the strong ignorability assumption (Rosenbaum and Rubin, 1983). With a binary instrumental variable, the parameters of interest are identifiable on an unobservable subgroup (compliers) of the population through a two-stage regression model. The Lasso estimation under a non-convex objective function is developed for the two-stage regression. Its de-sparsifying estimator and the inference procedure are proposed. The confidence interval for the treatment effect given specific covariates is also constructed. The proposed approach works for both continuous and categorical response variables under the framework of generalized linear models. Theoretical properties of the proposed method are derived, and simulation studies are conducted to evaluate its performance. A real data analysis on the Oregon Health Insurance Experiment is performed to illustrate the utility of the proposed method in practice.
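
For intuition about identification on compliers with a binary instrument, the textbook Wald ratio divides the instrument's effect on the outcome by its effect on treatment take-up. The sketch below implements only that covariate-free ratio on made-up data; it is not the Lasso-based two-stage estimator described in the abstract, and the function and variable names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def wald_late(y, d, z):
    """Wald ratio: (E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0])."""
    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    d1 = [di for di, zi in zip(d, z) if zi == 1]
    d0 = [di for di, zi in zip(d, z) if zi == 0]
    first_stage = mean(d1) - mean(d0)  # share of compliers under monotonicity
    if first_stage == 0:
        raise ValueError("instrument does not shift treatment take-up")
    return (mean(y1) - mean(y0)) / first_stage


# Made-up data in which treatment raises the outcome by exactly 2.
z = [1, 1, 1, 1, 0, 0, 0, 0]                  # binary instrument
d = [1, 1, 1, 0, 1, 0, 0, 0]                  # treatment received
y = [3.0, 3.0, 3.0, 1.0, 3.0, 1.0, 1.0, 1.0]  # outcome = 1 + 2 * d

late = wald_late(y, d, z)
```

Here the instrument raises take-up by 0.5 and the mean outcome by 1.0, so the ratio recovers the built-in effect of 2.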

JEL Classifications

C1 - Econometric and Statistical Methods and Methodology: General
C5 - Econometric Modeling
