Where are all the jobs? A machine learning approach for high resolution urban employment prediction – World Bank Group

Detailed data on the spatial distribution of jobs are crucial. They enable urban planners and developers to identify economic hubs within a city and take targeted measures to improve their productivity, connectivity, and resilience. For example, investing in infrastructure upgrades or flood protection systems, enhancing commuting options, and adapting urban planning decisions can support firms and yield city-wide benefits for the lives and livelihoods of workers and their communities.
And yet, few cities have high-resolution data on jobs, especially in developing countries. In practice, business registries, employment censuses, and travel surveys are the most common sources for mapping the density and spatial distribution of jobs within a city. But such data are rarely available, and when they do exist, they tend to be incomplete, unreliable, or outdated. As a result, urban planning and investment decisions often rely on patchy, outdated, or anecdotal information that cannot efficiently target employment centers.
Recent initiatives have successfully leveraged mobile phone-derived data to document “meaningful” locations, including jobs. This is a breakthrough, especially given the increasingly ubiquitous use of mobile phones. And yet, despite fast progress, accessing and processing mobile phone data remains a difficult, lengthy, and often costly process. Quick-to-deploy, cheap, and robust alternatives are sorely needed.
In a new study and its companion demonstration note, supported by the Global Facility for Disaster Reduction and Recovery, we develop a machine learning algorithm that relies on widely available public data to predict the spatial distribution of employment in urban areas at high resolution. We train and test the algorithm on 14 cities in Sub-Saharan Africa and Latin America for which survey-based observed employment data were available. These cities range from Abidjan, Ivory Coast to Nairobi, Kenya in Sub-Saharan Africa, and from Belo Horizonte, Brazil to Mexico City, Mexico in Latin America. When comparing our predicted employment maps against employment observations, we find very robust performance of the prediction algorithm, with R2, the goodness-of-fit measure, averaging 0.63 and reaching values of up to 0.8. In cities where existing employment maps are coarse, the algorithm may in fact offer more granular insights (Table 1).
Sub-Saharan Africa: Dar Es Salaam
Latin America: Belo Horizonte, Buenos Aires, Mexico City
Table 1: Performance (R2) of the machine learning algorithm when comparing observed and predicted spatial distribution of jobs in 14 cities across Sub-Saharan Africa and Latin America 
We demonstrate that the approach is scalable, quick, and low-cost. It can yield high-resolution job density maps that can be used in the absence of alternative official data and that offer highly detailed insights into the spatial structure of urban economies (Figure 1).
Figure 1: Observed and predicted employment density in urban Buenos Aires, Argentina (R2=0.78). Employment data source: LOGIT (2012), based on 2004/5 Censo Nacional Económico/the Argentinian National Institute of Statistics and Census (INDEC) and 2011 Encuesta Permanente de Hogares/INDEC.
Note: R2, or goodness of fit measure, indicates the share of variation in the observed employment density that can be explained by the algorithm.
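To make the note above concrete, here is a minimal sketch of how such a goodness-of-fit score can be computed. It assumes the standard coefficient of determination (one minus the ratio of residual to total variation); the post does not state which exact formula the study uses, and the job counts below are hypothetical values for illustration only.

```python
def r_squared(observed, predicted):
    """Share of variation in `observed` that is explained by `predicted`."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Variation left unexplained by the predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total variation of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have no variation")
    return 1.0 - ss_res / ss_tot


# Hypothetical jobs per grid cell (not data from the study).
observed = [120.0, 40.0, 300.0, 15.0, 80.0]
predicted = [100.0, 55.0, 260.0, 30.0, 90.0]
score = r_squared(observed, predicted)
print(f"R2 = {score:.2f}")
```

A score of 1 means the predictions match the observations exactly, while a score of 0 means they do no better than predicting the average density everywhere.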
The idea behind the method is simple: locations within a city that have higher concentrations of amenities (such as restaurants, but also ATMs and schools), road intersections, or public transport stops, or that display more intense nighttime lights, among other features, are more likely to be hubs of economic activity and employment. Conversely, terrain roughness, water bodies, and vegetation indices are likely to be negatively correlated with the presence of jobs. We implement this notion through a machine learning algorithm that takes into account an array of data sources extracted from OpenStreetMap and Google Earth Engine.
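The intuition can be sketched in a few lines of code. The post does not say which machine learning algorithm the study uses, so the example below swaps in a simple k-nearest-neighbours regressor purely for illustration. The feature set, the training cells, and the job counts are all hypothetical; in practice, the features would be derived from sources such as OpenStreetMap and Google Earth Engine.

```python
import math

# Hypothetical grid cells with made-up features:
# (amenities, road intersections, nighttime light intensity, terrain roughness)
TRAIN_FEATURES = [
    (50, 30, 0.9, 0.1),
    (45, 28, 0.8, 0.1),
    (5, 4, 0.2, 0.6),
    (2, 3, 0.1, 0.8),
    (20, 15, 0.5, 0.3),
    (25, 12, 0.6, 0.2),
]
# Hypothetical observed jobs in each training cell.
TRAIN_JOBS = [900.0, 850.0, 60.0, 20.0, 300.0, 340.0]


def fit_scaler(rows):
    """Return per-feature (mean, std) so that features share a common scale."""
    stats = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        var = sum((v - mean) ** 2 for v in col) / len(col)
        stats.append((mean, math.sqrt(var) or 1.0))  # avoid division by zero
    return stats


def scale(row, stats):
    """Standardise one feature row with the fitted (mean, std) pairs."""
    return [(v - mean) / std for v, (mean, std) in zip(row, stats)]


def predict_jobs(cell, k=2):
    """Predict jobs in `cell` as the mean of its k most similar training cells."""
    stats = fit_scaler(TRAIN_FEATURES)
    target = scale(cell, stats)
    by_distance = sorted(
        (math.dist(target, scale(row, stats)), jobs)
        for row, jobs in zip(TRAIN_FEATURES, TRAIN_JOBS)
    )
    nearest = by_distance[:k]
    return sum(jobs for _, jobs in nearest) / len(nearest)


dense_cell = (48, 29, 0.85, 0.1)   # many amenities, bright at night
sparse_cell = (3, 3, 0.15, 0.7)    # few amenities, rough terrain
print(predict_jobs(dense_cell), predict_jobs(sparse_cell))
```

Cells that resemble known employment hubs receive high predictions, and cells that resemble sparsely built or rugged areas receive low ones. A production model would be trained on many more cells and features and validated against held-out observations.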
High-resolution employment maps open a host of possibilities for operational and analytical applications. For instance, such maps allow for more systematic employment accessibility analyses in cities, with the goal of assessing and improving the effectiveness of urban transport investments and land use interventions. This means that urban planning and investment decisions can specifically target weak points and bottlenecks that hold back urban prosperity and resilience.
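As an illustration of what an employment accessibility analysis can look like, the sketch below computes a cumulative-opportunities measure: the number of jobs reachable from each zone within a travel-time threshold. This is one common accessibility metric, not necessarily the one used in any particular World Bank analysis, and the zones, job counts, travel times, and 45-minute threshold are all hypothetical.

```python
def jobs_within_reach(travel_minutes, jobs_by_zone, threshold=45):
    """For each origin zone, sum the jobs in zones reachable within `threshold` minutes."""
    return {
        origin: sum(
            jobs_by_zone[dest]
            for dest, minutes in row.items()
            if minutes <= threshold
        )
        for origin, row in travel_minutes.items()
    }


# Hypothetical predicted jobs per zone.
jobs_by_zone = {"centre": 5000, "north": 800, "south": 300}

# Hypothetical travel times in minutes between zones (origin -> destination).
travel_minutes = {
    "centre": {"centre": 10, "north": 35, "south": 50},
    "north": {"centre": 35, "north": 10, "south": 80},
    "south": {"centre": 50, "north": 80, "south": 10},
}

access = jobs_within_reach(travel_minutes, jobs_by_zone)
print(access)
```

In this made-up example, residents of the southern zone can reach only the jobs in their own zone within 45 minutes, which flags the south-to-centre link as a candidate bottleneck.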
Employment data could also support the development and wider application of quantitative spatial economic models, which can be used to identify and prioritize effective economic policies. And high-resolution employment data could constitute a key piece of the puzzle in understanding agglomeration forces in developing-country cities and their link with urban spatial layouts.
With the validation exercise giving us confidence in the approach’s ability to approximate the distribution of jobs in urban areas, we are currently scaling up. While we started by comparing our results with observations in 14 cities, we are now focusing on building a library of employment prediction maps for a thousand cities in developing countries (see examples in Figure 2). This library will be made publicly available, following basic validation steps.
Figure 2: Employment predictions across various cities 
Senior Economist, Global Facility for Disaster Reduction and Recovery (GFDRR), World Bank
Senior Economist
Researcher at the Oxford Martin School at the University of Oxford/UK
Associate Professor and Acting Director of Research at the Centre for Advanced Spatial Analysis (CASA) at University College London