Using geo-spatial data for the impact space
One of my favorite activities is looking at maps as they tell us a lot about our world.
In recent years, we have also seen the emergence of something called “Open-Source Intelligence”. Individuals have started to use satellite data, social media images or public registries to investigate events on their own. The approach is well documented in the book “We Are Bellingcat: An Intelligence Agency for the People”.
For example, if you are following the war in Ukraine you might have heard of NASA’s FIRMS, which stands for Fire Information for Resource Management System. It is a very impressive tool for following developments in a war for which little independent information is available. Even without knowing anything about the war, you can guess from the fire data where the fighting takes place.
What are some interesting examples and what do they tell us?
Let us now move from war to something closer to social impact and look at five different cases.
The examples show a small sample of what is available and what can be used for impact purposes.
Intra-urban temperatures and income levels
In a recent paper titled “Unprivileged groups are less served by green cooling services in major European urban areas”, Alby Duarte Rocha from TU Berlin and colleagues mapped income levels against green cooling services.
Green cooling services create urban cool islands and are generated through tree shade or evaporation. They can lower air temperatures for the local communities. In other words, a lack of green cooling services leads to higher temperatures.
In the paper, they show maps for several European cities including Vienna, Berlin, Paris and London. The maps let you identify urban cool and heat islands, which helps to pinpoint areas for interventions.
Crops
Can you tell the difference between wheat, rye and oat in the image below?
There is new research which aims to identify crops using satellite data.
Take a look at the image below created in the EuroCrops project. They have collected the data which can be used to train machine learning algorithms to identify crops. The dataset is quite impressive:
It comprises 706 683 multi-class labeled data points across 176 classes, featuring annual time series of per-parcel median pixel values from Sentinel-2 L1C data for 2021, along with crop type labels and spatial coordinates. Based on the open-source EuroCrops collection, EuroCropsML is publicly available on Zenodo.
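To make the idea of “per-parcel median pixel values” more tangible, here is a minimal Python sketch. The reflectance numbers and dates are invented for illustration and are not taken from EuroCropsML; the real dataset is built from Sentinel-2 L1C imagery with many more bands and time steps.

```python
from statistics import median

# Hypothetical reflectance values of the pixels inside ONE parcel,
# for a single spectral band, on three acquisition dates (made-up numbers).
parcel_pixels = {
    "2021-04-01": [0.21, 0.24, 0.22, 0.90, 0.23],  # 0.90 mimics a cloud-affected pixel
    "2021-06-01": [0.41, 0.43, 0.40, 0.44, 0.42],
    "2021-08-01": [0.30, 0.29, 0.31, 0.28, 0.30],
}


def parcel_time_series(pixels_by_date):
    """Reduce each date's pixels to one median value for the parcel."""
    return {date: median(values) for date, values in sorted(pixels_by_date.items())}


series = parcel_time_series(parcel_pixels)
for date, value in series.items():
    print(date, value)
```

The median is a sensible choice here because a single outlier pixel, such as the 0.90 value in April, does not shift the parcel value. A crop classifier would then be trained on such time series, one per parcel and band.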
NASA is working on a similar data project with a focus on four primary crop types (wheat, maize, rice, and soy). Their data shows crop conditions, which matters because an early assessment allows better planning and interventions when conditions are well below average.
Poor: Crop conditions are well below average. Crop yields are likely to be 5% below average. This is only used when conditions are not likely to be able to recover, and impact on yields is likely.
Satellite data can thus save lives.
Population density
Population density maps show impressively how much datasets can differ depending on the methods used to build them.
One dataset is provided by Eurostat. It includes the following variables and is based on the 2021 Census:
It is thus an exact measure of the European population based on census data.
Compare this data to a dataset based on machine learning. Data for Good at Meta has used machine learning to estimate population density in grids of 30 x 30 meters (check out the paper if you are interested in the methodology). The differences between the census data and the AI-based data show both the potential of machine learning and the need to check its estimates against ground truth.
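If you want to compare a gridded estimate with a census figure yourself, the arithmetic is simple. The following Python sketch assumes a tiny, made-up 3 x 3 grid of estimated people per 30 x 30 meter cell and a made-up census count for the same area; neither comes from the Meta or Eurostat datasets.

```python
CELL_SIZE_M = 30
# Area of one grid cell in square kilometers (900 m² = 0.0009 km²).
CELL_AREA_KM2 = (CELL_SIZE_M * CELL_SIZE_M) / 1_000_000


def density_per_km2(people_in_cell):
    """Convert an estimated head count in one cell to people per km²."""
    return people_in_cell / CELL_AREA_KM2


# Hypothetical model output: estimated people per cell (fractions are normal
# for modelled data).
grid = [
    [0.0, 1.5, 3.0],
    [2.0, 4.5, 0.0],
    [0.0, 0.5, 1.0],
]

estimated_total = sum(sum(row) for row in grid)
census_total = 12  # hypothetical census count for the same nine cells

# Positive values mean the model estimates more people than the census counted.
relative_difference = (estimated_total - census_total) / census_total

print(f"Estimated total: {estimated_total}")
print(f"Relative difference to census: {relative_difference:.1%}")
```

Summing cells up to a census unit before comparing is the usual direction, because the census cannot be broken down to 30 meter cells without a model of its own.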
Differences between day and night
There are also some interesting datasets which measure the economic activity as well as population mobility.
A team of researchers has combined different datasets to calculate the proportion of people in a certain area during the day and at night. Their paper is suitably titled “Uncovering temporal changes in Europe’s population density patterns using a data fusion approach”. The image below shows the examples of Paris, Lisbon and Milan and the differences between daytime and nighttime. The top row shows nighttime and the bottom row daytime activity.
The map is multi-dimensional: it also has to account for seasonal tourist flows, which are visible in Southern Portugal or at Disneyland Paris:
For example, Paris, France, is characterized by a net gain in population in daytime in a rather large area corresponding to the city core, resulting from a large concentration of economic activities, surrounded by a belt of predominantly residential areas that lose population in daytime. Although much smaller, the city of Lisbon shows a similar pattern, whereas in Milan the areas with higher population densities in daytime appear more scattered. Differences between August and January also have very distinct spatial patterns. The historical core of Paris clearly gains population in August compared to January, whereas most of its surroundings display a net loss. Some positive hotspots are visible in areas such as the Charles De Gaulle airport and Disneyland. In the south of Portugal, population in August outweighs the population in January, both in the historical center of Lisbon and the in southernmost coastal areas of Algarve. Finally, in the North of Italy, all the Milan metropolitan area loses population in August, whereas gains are observed around the lakes Maggiore, Como, and, even more noticeably, Garda.
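The “net gain” and “net loss” in the quote above boil down to a difference between two population grids. Here is a minimal Python sketch with two invented areas and invented head counts; it only illustrates the comparison and is not the data fusion method from the paper.

```python
# Hypothetical population counts per area at night and during the day.
cells = {
    "city_core": {"night": 10_000, "day": 25_000},
    "residential_belt": {"night": 20_000, "day": 12_000},
}


def daytime_change(cell):
    """Positive: the area gains people in daytime. Negative: it loses them."""
    return cell["day"] - cell["night"]


changes = {name: daytime_change(cell) for name, cell in cells.items()}
gaining = [name for name, change in changes.items() if change > 0]

for name, change in changes.items():
    print(f"{name}: {change:+d}")
```

The same subtraction works for August versus January if you swap the two inputs for seasonal grids.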
Deforestation
You can also use satellite data to see the amount of deforestation over time. Global Forest Watch provides the information in the image below, which again shows the area around Vienna, where I live.
There are plenty of other layers, including forest carbon removals and biodiversity areas. Global Forest Watch also has a layer showing biodiversity intactness across many areas of the earth. High biodiversity intactness implies minimal human interference with nature.
Characteristics of geo-spatial data
Let us discuss the somewhat surprising aspects of geo-spatial data.
Open-source nature
The industry is built on open access and open-source algorithms. Let me share two examples beyond those we have already discussed above.
A few researchers based in Brazil, France, Australia and Japan have developed “Deep Wealth”, which calculates a wealth index with earth observation data and machine learning algorithms. They have provided all the data, source code and metadata on GitHub.
That means that everyone can use their models to calculate wealth indices such as those for Madagascar in the example below.
The same applies to NeuralGCM, one of the leading weather forecast models, developed by Google AI. GCM stands for General Circulation Model. Again, all relevant files are available on GitHub.
Earth observation data as common good
Many agencies supply the data as a common good.
The European Space Agency operates different satellites. Sentinel-1 provides radar images, Sentinel-2 offers high-resolution optical images for land cover classification, and Sentinel-3 focuses on sea surface temperature and ocean color. Other satellites measure atmospheric chemistry or sea level height. The same applies to NASA, which provides a range of interesting datasets.
That means that the information about our world is provided as a publicly available service. We should make better use of the data.
However, a lot of high-quality satellite imagery is still prohibitively expensive. Images with a 30 cm resolution can cost up to a few thousand Euros as there is a minimum order quantity for areas as large as 100 square kilometers.
Mostly limited to specialists
It is a great opportunity to use the available data for impact-related areas, but it is mostly limited to specialists.
Satellite data comes with many nuances, and working with it requires substantial computational power. For example, Sentinel-2 uses 13 spectral bands at different spatial resolutions to map the world. These bands can be combined in different ways for different purposes and use cases.
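One widely used example of combining spectral bands is the Normalized Difference Vegetation Index (NDVI), which for Sentinel-2 is usually computed from the red band (B4) and the near-infrared band (B8). The sketch below shows only the formula in Python; the reflectance values are made up, and a real workflow would read whole rasters rather than single numbers.

```python
def ndvi(nir, red):
    """NDVI = (NIR - Red) / (NIR + Red); values range from -1 to 1."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid division by zero for empty or no-data pixels
    return (nir - red) / total


# Hypothetical surface reflectance values for two pixels.
dense_vegetation = ndvi(nir=0.50, red=0.10)  # healthy plants reflect a lot of NIR
bare_soil = ndvi(nir=0.30, red=0.25)         # soil reflects both bands similarly

print(f"Dense vegetation: {dense_vegetation:.2f}")
print(f"Bare soil: {bare_soil:.2f}")
```

Higher values indicate greener, denser vegetation, which is why indices like this show up in crop monitoring and in work on urban green spaces.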
There is also the requirement to manage large datasets which need to be continuously updated.
In addition, global analysis requires an understanding of local aspects. Take the green cooling services as an example. Looking at them makes perfect sense for many European cities, but for cities like Rio de Janeiro, Tehran or Las Vegas, it makes more sense to look at air conditioning, architectural differences and human habits.
Outlook
A short side note. Artificial intelligence has been around for some 70 years, but it only took off with the availability of computing power, cloud-based data infrastructure and very large data pools (the internet) since the 2000s.
We can see a similar pattern for geo-spatial data. We have very large (and publicly available) datasets, open-source software and the necessary computing power. It is surprising that we still do not have analytics platforms for the social economy, as there are many use cases.
Let us imagine that you are funding interventions to get people back into employment. You might want the interventions to take place in areas which have fewer economic resources than other areas of the city. That would be quite straightforward with geo-spatial data, which goes beyond the current analysis based on postal codes.
Let us imagine that you are funding the installation of solar panels. You could use satellite data to check that the solar panels exist and also use the data for audit purposes. It might be a bit more expensive as you pay €3,000 for high-resolution images covering 100 square kilometers, but that can still be feasible in densely populated areas.
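A quick back-of-the-envelope calculation shows why density matters for the solar panel case. The price and the area are the figures mentioned above; the number of funded installations per image is a hypothetical input that you would replace with your own.

```python
IMAGE_PRICE_EUR = 3_000    # price for one high-resolution order (from the text)
MIN_ORDER_AREA_KM2 = 100   # minimum area covered by that order (from the text)

price_per_km2 = IMAGE_PRICE_EUR / MIN_ORDER_AREA_KM2


def cost_per_installation(installations_in_area):
    """Imagery cost per verified installation within one 100 km² order."""
    if installations_in_area <= 0:
        raise ValueError("need at least one installation in the imaged area")
    return IMAGE_PRICE_EUR / installations_in_area


# Hypothetical scenarios: many funded installations in a city, few in a rural area.
dense_area = cost_per_installation(500)
sparse_area = cost_per_installation(10)

print(f"Price per km²: €{price_per_km2:.0f}")
print(f"Dense area: €{dense_area:.0f} per installation")
print(f"Sparse area: €{sparse_area:.0f} per installation")
```

With these assumed numbers, verification costs a few euros per installation in a dense area but a few hundred euros where installations are scattered.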
What does that mean for impact data platforms?
Imagine the following: You know the income distribution in a certain block together with average temperature and the number of people living there. Now you can compare every block of a city. What could you do with this information?
Imagine another case: You can forecast the weather for a certain rural area in 2050 and already know which crops are grown there. What could you do with this information?
There are plenty of use cases but still a lack of platforms to make use of this information. Time to change it!