Data Science and AI: The Next Frontier for Evidence-Based Policy-Making

Paper Session

Saturday, Jan. 4, 2020 8:00 AM - 10:00 AM (PDT)

Manchester Grand Hyatt, Gaslamp AB
Hosted By: International Trade and Finance Association
  • Chair: Thierry Warin, SKEMA Business School

From Public Data to Responsible A.I.: The Next Generation of Public Data Distribution Platforms

Cesar A. Hidalgo, Massachusetts Institute of Technology

Abstract

In recent years, advances in big data and algorithms have made it possible to include algorithmic decision making in the decision pipelines of governments and businesses. But how should governments and companies organize and communicate their data? How and when should they include A.I. in their decision-making process? And how will their employees and customers react to the inclusion of A.I. in their organizations? In this presentation I will discuss recent advances in the creation of data integration, distribution, and visualization tools designed to augment the decision-making pipelines of organizations. These tools include A.I. concepts and techniques that have become relevant for our understanding of economic development, and have opened the door to new techniques to collect and distribute public data. I show how these tools are paving the way for A.I. to become an integral part of an organization, and conclude by showing empirical work documenting differences in how people judge human and A.I. actions.

On the Measurement of Public Opinion in the Age of Big Data

Clifton van der Linden, McMaster University

Abstract

Democracy is predicated on the idea that governments are responsive to the publics which they are elected to represent. In order for elected representatives to govern effectively, they require reliable measures of public opinion. Traditional sources of public opinion research are increasingly complicated by the expanding modalities of communication and accompanying cultural shifts. Diversification of information and communications technologies, together with a steep decline in survey response rates, is producing a crisis of confidence in conventional probability sampling. An increasingly rich yet relatively untapped source of public opinion takes the form of extraordinarily large, complex datasets commonly referred to as Big Data. Artificial Intelligence, and machine learning in particular, offers new opportunities for addressing the challenges for statistical inference as they pertain to Big Data, not least of which is that these data typically take the form of non-probability samples. This paper argues that, under specific circumstances and given the application of machine learning techniques, certain types of non-probability sample may be capable of yielding reliable inferences about a population of interest. To demonstrate this argument, it analyzes the inferences derived from the most extraordinary probability and non-probability samples collected during the 2015 Canadian federal election campaign: the Canadian Election Study (CES) and Vote Compass, respectively. It uses the election outcome as a benchmark and models the observations collected from each sample to assess how accurately they are able to forecast the distribution of the vote.
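As a rough illustration of the benchmarking step described above, one simple way to score a sample's forecast against an election outcome is the mean absolute error across parties. The sketch below is not the paper's method: the function name, party labels, and every vote share are hypothetical placeholders chosen for illustration, not figures from the CES, Vote Compass, or the 2015 election.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute gap, in percentage points, over the parties in `actual`."""
    return sum(abs(predicted[party] - actual[party]) for party in actual) / len(actual)


# Hypothetical vote shares in percent. These are placeholders, not real data.
benchmark = {"Party A": 40.0, "Party B": 32.0, "Party C": 20.0, "Party D": 8.0}
probability_sample = {"Party A": 38.0, "Party B": 33.0, "Party C": 21.0, "Party D": 8.0}
non_probability_sample = {"Party A": 41.0, "Party B": 30.0, "Party C": 22.0, "Party D": 7.0}

for label, forecast in [
    ("probability sample", probability_sample),
    ("non-probability sample", non_probability_sample),
]:
    error = mean_absolute_error(forecast, benchmark)
    print(f"{label}: mean absolute error = {error:.2f} points")
```

A lower score means the modeled sample came closer to the benchmark distribution of the vote; other accuracy measures could be substituted without changing the overall comparison.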

Health Outcomes in China for the Hukou Migrants: How Algorithms May Inform Public Policymakers

Marta Bengoa, City University of New York-City College
Thierry Warin, SKEMA Business School

Abstract

We use machine learning techniques to examine whether hukou status plays a role in the health outcomes of migrant workers. We use survey data reported in the Longitudinal Survey on Rural Urban Migration in China (RUMiC) from the Institute for the Study of Labor (IZA). The survey collects data for 71,074 individuals (29,556 urban persons; 32,171 rural persons; and 9,347 migrants, the latter equal to approximately 29% of the rural count) in two waves, for the years 2008 and 2009. The survey contains data on socioeconomic indicators, such as education, income, ethnicity, and hukou registration. The RUMiC survey also includes data on many health indicators and outcomes. When using objective health measures, the effect increases in magnitude and significance, but tends to disappear as migrants remain in urban areas, suggesting a network effect of informal access to health providers, increases in income, and access to private health care. Migration will require adjustments in health provision to accommodate the changing spatial demographics. Restricting migrants' access to healthcare will clearly have an effect in the long run, including on migrants' health, productivity, and potential economic growth.

Evidence-Based Health and Environmental Policies and the Potential Mismatch with Citizens' Perceptions: A Data Science Perspective

Ann Backus, Harvard University
Nathalie de Marcellis, Polytechnique Montreal & CIRANO (Montreal)

Abstract

In this paper, we map the conversations on Twitter about shale gas and fracking in the US. Not only the data (the content of the tweets) but also the metadata are of interest. The content allows us to carry out semantic analysis and thus map positive and negative comments and the topics of discussion (health impacts, environmental impacts, etc.). With the metadata, we can, for instance, use latitude and longitude to map where tweets originate and compare those locations to the locations of the shale plays. We can thus add a spatial dimension to the conversations. The results are particularly interesting in the context of the design of public policies, notably when it comes to health and environmental policies.
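The spatial comparison described above can be sketched with a great-circle (haversine) distance between a tweet's coordinates and a set of reference locations. This is only an illustrative sketch under stated assumptions, not the authors' procedure: the function names, site labels, and all coordinates are hypothetical, and real shale plays are areas rather than single points.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_site(tweet_lat, tweet_lon, sites):
    """Return (site_name, distance_km) for the reference point closest to the tweet."""
    distances = (
        (name, haversine_km(tweet_lat, tweet_lon, lat, lon))
        for name, (lat, lon) in sites.items()
    )
    return min(distances, key=lambda pair: pair[1])


# Hypothetical reference points (latitude, longitude); not actual shale play locations.
sites = {"site_north": (41.0, -77.0), "site_south": (32.0, -102.0)}

name, distance = nearest_site(40.5, -76.5, sites)
print(f"nearest reference point: {name} ({distance:.0f} km)")
```

In practice a point-in-polygon test against the boundaries of each shale play would be a more faithful comparison than distance to a single point.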
Discussant(s)
Joseph Pelzman, George Washington University
Gina Pieters, University of Chicago
Aleksandar Stojkov, Ss. Cyril and Methodius University
Cesar A. Hidalgo, Massachusetts Institute of Technology
JEL Classifications
  • H1 - Structure and Scope of Government
  • Y1 - Data: Tables and Charts