High-Dimensional Econometrics and Machine Learning
Paper Session
Sunday, Jan. 3, 2021 3:45 PM - 5:45 PM (EST)
- Chair: Mikkel Soelvsten, University of Wisconsin-Madison
Sparse Quantile Regression
Abstract
We estimate a quantile regression model with a penalty on the number of selected covariates. We derive probability bounds on the estimated sparsity as well as probability and expectation bounds on the excess quantile prediction risk and the mean-square parameter estimation error of our proposed estimator. These theoretical results are non-asymptotic and established in a high-dimensional setting. In particular, we show that our method yields a sparse estimator whose L0-norm can be close to true sparsity with high probability and obtain the oracle rates of convergence for the excess prediction risk and the mean-square parameter estimation error. We implement the proposed procedure via the method of mixed integer linear programming and also a more scalable first-order approximation algorithm. The finite-sample numerical performance is illustrated in Monte Carlo experiments.
Testing Many Restrictions Under Heteroskedasticity
Abstract
We propose a hypothesis test that allows for many tested restrictions in a heteroskedastic linear regression model. The test compares the conventional F-statistic to a critical value that corrects for many restrictions and conditional heteroskedasticity. The correction utilizes leave-one-out estimation to recenter the conventional critical value and leave-three-out estimation to rescale it. Large sample properties of the test are established in an asymptotic framework where the number of tested restrictions may grow in proportion to the number of observations. We show that the test is asymptotically valid and has non-trivial asymptotic power against the same local alternatives as the exact F test when the latter is valid. Simulations corroborate the relevance of these theoretical findings and suggest excellent size control in moderately small samples, even under strong heteroskedasticity.
Inference for High-Dimensional Exchangeable Arrays
Abstract
For multiway cluster sampled data and dyadic data, we develop novel bootstrap methods and theories for inference about multi-dimensional, increasing-dimensional and high-dimensional parameters. Based on non-asymptotic Gaussian approximation error bounds for the test-statistic on hyper-rectangles, we propose novel bootstrap methods and establish their finite sample validity. We illustrate applications of our proposed methods to robust inference in demand analysis, robust inference in extended gravity analysis, and construction of uniform confidence bands for densities of migration and trade.
Inference for Heterogeneous Treatment Effects for Observational Data with High-Dimensional Covariates
Abstract
We consider heterogeneous treatment effects on a set of high-dimensional covariates for observational data without the strong ignorability assumption (Rosenbaum and Rubin, 1983). With a binary instrumental variable, the parameters of interest are identifiable on an unobservable subgroup (compliers) of the population through a two-stage regression model. The Lasso estimation under a non-convex objective function is developed for the two-stage regression. Its de-sparsifying estimator and the inference procedure are proposed. The confidence interval for the treatment effect given specific covariates is also constructed. The proposed approach works for both continuous and categorical response variables under the framework of generalized linear models. Theoretical properties of the proposed method are derived, and simulation studies are conducted to evaluate its performance. A real data analysis on the Oregon Health Insurance Experiment is performed to illustrate the utility of the proposed method in practice.
JEL Classifications
- C1 - Econometric and Statistical Methods and Methodology: General
- C5 - Econometric Modeling