USA Urban Tree Cover with Nearmap: Technical Back Story

Dr. Michael Bewley

Mapping the evolution of cities with petabyte scale deep learning on geospatial imagery.

Published: 7 July 2023

It's nearly a year and a half since I posted the Australian National Tree Cover Analysis Back Story. That's a long time between stories - what have we been up to? In the interim, we've been busy building and rolling out a slew of new products built on top of Nearmap AI Gen 5 (packing around 80 semantic segmentation layers into our model, and massively increasing our training set size).

I also squeezed in the time to do a decade-long longitudinal study of tree canopy in the city of Adelaide in a four-part blog series (including tracking the story of several suburbs over a decade with some fun visualisations, a qualitative comparison with LiDAR, and a quantitative comparison with both LiDAR and human expert labels).

I wanted to take things to the next level, and expand both geographically and in terms of scale. For that, I took things stateside. Where Australia's population is around 26 million, the United States of America is home to over 331 million residents, according to the freshly released 2020 census data. That's a 12-13x scale-up. The USA also brought some new challenges, such as a different census data set and a far higher prevalence of deciduous trees (which we often capture deliberately in a "leaf off" state).

Methodology

I wanted to be as consistent as I could with our work on Australia. As a quick summary, this meant:

Using the "Gen 4" version of "Medium/High Vegetation" as the tree canopy layer (which uses identical code, even down to the deep learning model, as the Australian data)
A focus on tree canopy, buildings and population data, to observe how the three are distributed relative to each other.
A single time point analysis (longitudinal is possible, but more work than I could commit to for this study!)
Combining and summarising at the lowest census level (mesh blocks in Australia, vs census blocks in USA)

Scale

The most obvious difference between the two studies was scale. To make the problem of analysing such a large amount of data practical, I moved on from "Nellie", the old faithful workstation, with a nostalgic sigh, and switched across to SageMaker Studio. While it would have been technically possible to re-engineer the analysis code to work on Nellie (computing in parallel for longer, and chunking to avoid memory limits), there was something seductively simple about dialling up a 96 core, 768 GB RAM monster of a machine at the click of a button for the more intensive parts of the analysis, and falling back to something smaller and cheaper for working on final visualisations and statistics.

USA 2020 Census

At the time of performing the analysis, the USA census had just been updated with 2020 statistics. Because the US census is run once a decade, rather than every 5 years as in Australia, it was important not to fall back on population data more than 10 years old.

Census Blocks (US) differ slightly from Mesh Blocks (AU), but are in principle the same - the smallest statistical area unit. Census Blocks are deliberately focussed on visible boundaries, like roads and rivers, and often represent one "city block". They also have a larger population (typically 600-3,000 people).

There was no available categorisation of census block type (such as with "residential" mesh blocks), so a zero census population in a census block was used as a proxy for a non-residential label.

"Suburb" analysis and "Greater Sydney" type areas don't have direct analogues in US data. Instead, we aggregated the Census Blocks up to the 2020 "Places" data. Places are larger in population than Australian suburbs, and a subset of them represent the metropolitan areas of US state capitals.

An improved version of the nearmap-ai-user-guides Python package was used to pull the data. It should not introduce any methodological changes, but it deals better with Census Blocks (which are frequently multipolygons or contain holes, unlike Mesh Blocks).

Dealing with Seasonality

Seasonality was a much greater factor in the USA than in Australia. Deciduous trees form a larger part of the canopy in the USA, and part of the Nearmap capture program is explicitly focussed on capturing the "leaf off" scenario, in order to provide maximum visibility of the cities that lie beneath. It is important to note that Nearmap AI does capture trees without leaves quite effectively. However, the predicted areas are a little smaller (it can be hard to spot the thin, leafless edges of bare twigs at a tree's perimeter), and I didn't want that to introduce systematic bias between the cities. Leaf-off occurs at different times each year, and in different locations. To avoid the need to model geographically varying seasonality explicitly, I used the simpler approach of pulling all available surveys within a 24 month period (1st July 2020 to 30th June 2022), and choosing the survey date for each census block that had maximum tree cover. The number of surveys in the 24 month window typically varied between two and six, depending on the regularity of our capture program.
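The "maximum tree cover per census block" selection described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the code used in the study: the record layout, the block identifiers, the cover values and the function name pick_max_cover_survey are all made up for the example. Only the window dates and the selection rule come from the text.

```python
from datetime import date

# Hypothetical per-survey records: (census_block_id, survey_date, tree_cover_fraction).
# The values are invented for illustration only.
surveys = [
    ("block_a", date(2020, 8, 14), 0.31),
    ("block_a", date(2021, 2, 3), 0.22),   # e.g. a leaf-off capture with lower predicted cover
    ("block_a", date(2022, 5, 20), 0.33),
    ("block_b", date(2021, 1, 10), 0.08),
    ("block_b", date(2021, 7, 30), 0.12),
    ("block_c", date(2019, 6, 1), 0.50),   # outside the window, so it is ignored
]

# 24 month window stated in the article.
WINDOW_START = date(2020, 7, 1)
WINDOW_END = date(2022, 6, 30)


def pick_max_cover_survey(records, start=WINDOW_START, end=WINDOW_END):
    """Return {block_id: (survey_date, tree_cover)} for the in-window survey
    with the highest tree cover in each block."""
    best = {}
    for block_id, survey_date, cover in records:
        if not (start <= survey_date <= end):
            continue  # skip surveys outside the study window
        current = best.get(block_id)
        if current is None or cover > current[1]:
            best[block_id] = (survey_date, cover)
    return best


best_surveys = pick_max_cover_survey(surveys)
for block_id, (survey_date, cover) in sorted(best_surveys.items()):
    print(block_id, survey_date.isoformat(), f"{cover:.0%}")
```

Taking the maximum is a deliberately simple stand-in for a seasonality model: it only needs one leaf-on capture per block somewhere in the window.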

As an aside, if I were to repeat this study with Gen 5 Nearmap AI data, I would easily be able to take a much more nuanced approach. There is a new "leaf off" vegetation layer, which can be used to explicitly identify and ignore areas on dates impacted by seasonality.

Steps

The Nearmap Coverage API was used to check all available dates for each Census Block within the 24 month window.
The nearmap-ai-user-guides package was used to pull vegetation and building footprint data for each of these dates, for each Census Block.
The data was aggregated to percent cover of building and tree for each Census Block, and for each block the survey date with the maximum tree cover (and the values from that date) was kept.
Various aggregations were performed with the Census Block summary data set, such as "Places" to summarise at a town or city level.
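As a rough sketch of the last two steps, the snippet below turns per-block areas into percent cover and then rolls blocks up to a Place. The field names, the place names and the numbers are invented for the example, and the roll-up is assumed to be area-weighted (total feature area divided by total block area); the article does not say which weighting was actually used.

```python
from collections import defaultdict

# Hypothetical per-block summary rows (areas in square metres), one row per
# Census Block after the best survey date has been chosen.
block_rows = [
    {"block_id": "b1", "place": "Springfield", "block_area": 10_000.0,
     "tree_area": 2_500.0, "building_area": 3_000.0},
    {"block_id": "b2", "place": "Springfield", "block_area": 30_000.0,
     "tree_area": 3_000.0, "building_area": 6_000.0},
    {"block_id": "b3", "place": "Shelbyville", "block_area": 20_000.0,
     "tree_area": 10_000.0, "building_area": 2_000.0},
]


def percent_cover(feature_area, block_area):
    """Percent of block_area covered by feature_area (0 for empty blocks)."""
    return 100.0 * feature_area / block_area if block_area > 0 else 0.0


def summarise_places(rows):
    """Area-weighted tree and building percent cover per Place (an assumption)."""
    totals = defaultdict(lambda: {"block_area": 0.0, "tree_area": 0.0, "building_area": 0.0})
    for row in rows:
        place_totals = totals[row["place"]]
        for key in ("block_area", "tree_area", "building_area"):
            place_totals[key] += row[key]
    return {
        place: {
            "tree_pct": percent_cover(t["tree_area"], t["block_area"]),
            "building_pct": percent_cover(t["building_area"], t["block_area"]),
        }
        for place, t in totals.items()
    }


place_summary = summarise_places(block_rows)
print(place_summary)
```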

Results

National Statistics and Coverage

The Census Block summary included 4.65 million census blocks, and a population of 279.8 million; that's 83.6% of the nation's population as per the 2020 census.

This data set included 110.7 million building polygons (over 10 thousand square miles of buildings), and 279 thousand square miles of tree canopy.

The median census block tree cover (for residential blocks with population > 0) was 21%.

Finally, it turned out that 52% of the USA population covered by this study were living in a "leafy" census block (defined in the same way as the Australian national study, requiring at least 20% tree cover).
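The two headline statistics above (median tree cover over populated blocks, and the share of people living in "leafy" blocks) can be computed as in the sketch below. The block records are invented for illustration; the 20% threshold and the use of zero population as a non-residential proxy come from the article.

```python
from statistics import median

LEAFY_THRESHOLD = 0.20  # at least 20% tree cover, as in the Australian study

# Hypothetical census block records; tree_cover is a fraction of block area.
census_blocks = [
    {"population": 120, "tree_cover": 0.35},
    {"population": 80, "tree_cover": 0.10},
    {"population": 0, "tree_cover": 0.60},   # zero population: treated as non-residential
    {"population": 200, "tree_cover": 0.20},
]

# Zero census population is the proxy for a non-residential block.
residential = [b for b in census_blocks if b["population"] > 0]

# Median is taken over blocks, not weighted by population.
median_cover = median(b["tree_cover"] for b in residential)

# Leafy share is weighted by population: people in leafy blocks / all people.
leafy_population = sum(
    b["population"] for b in residential if b["tree_cover"] >= LEAFY_THRESHOLD
)
total_population = sum(b["population"] for b in residential)
leafy_share = leafy_population / total_population

print(f"median residential block tree cover: {median_cover:.0%}")
print(f"population living in leafy blocks: {leafy_share:.0%}")
```

Note that the median is a per-block statistic while the leafy share is a per-person statistic, so the two numbers need not move together.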

In this map, yellow shows all census blocks included in the analysis, and red shows towns and cities from the Places data set.

Figure: Census Block (yellow) and Places (red) map.

Here, the 4.65 million census blocks are shown shaded in greens (darker for higher tree coverage).

City and Town Analysis

The Places data set includes a wide range of cities and towns - from very large, to small. The criteria for giving a valid result on a "Place" was that we needed coverage of at least 90% of the Census Blocks within the Place, and the population needed to be at least 1,000 residents.
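The validity criteria for a Place can be written as a small predicate. This is a sketch under one assumption worth flagging: "coverage of at least 90% of the Census Blocks" is interpreted here as a count of blocks with data, not as a share of area. The function and parameter names are hypothetical.

```python
def is_valid_place(blocks_in_place, blocks_with_coverage, population,
                   min_block_coverage=0.90, min_population=1_000):
    """Return True if a Place meets both reporting criteria.

    blocks_in_place: number of Census Blocks belonging to the Place.
    blocks_with_coverage: how many of those blocks have data (assumed count-based).
    population: total census population of the Place.
    """
    if blocks_in_place == 0:
        return False  # nothing to report on
    coverage = blocks_with_coverage / blocks_in_place
    return coverage >= min_block_coverage and population >= min_population


print(is_valid_place(100, 90, 1_000))   # meets both thresholds exactly
print(is_valid_place(100, 89, 5_000))   # too few blocks covered
print(is_valid_place(100, 100, 999))    # population too small
```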

46 of these Places represent recognised boundaries of state capitals. 377 of the Places had a population greater than 100,000, 978 Places a population of at least 50,000, 4,379 at least 10,000, and 11,684 at least 1,000.

Capital city ranking in the Leafiest Capital Cities blog post was performed by intersecting the census blocks with the Places data set. As a result, a perimeter of census blocks surrounding each Place is also included. This has the effect of placing a city in the context of its immediate surroundings, and has an influence on the results (whether the city is surrounded by national park, or other urban areas). An image of the highest ranking capital city - Charleston, West Virginia - is displayed below. Blue outlines show census block boundaries with non-zero population that were included in the analysis, and the orange outline shows the official boundary of the city from the Places data set.

The raster AI Layers are essentially the raw output of the deep learning model - in the image below, we show roof tops in orange, tree canopy in green, and overlap between the two in red. Charleston, West Virginia and Little Rock, Arkansas are two of the leafiest state capitals in the USA. However, Charleston's trees are mostly around the perimeter of the city, whereas Little Rock's are mixed much more evenly amongst the suburban houses.

When aggregated at the census block level (shaded darker for higher tree cover), this difference in distribution becomes more apparent:

Conclusion

Nearmap imagery and AI form a unique data set on which to perform earth observation on detailed urban environments, at massive scale. Longitudinal studies, inter-city, and even international comparisons are all possible with consistent methodology and a high degree of accuracy. While this analysis looks solely at two of our eighty or so layers (trees and buildings), this work is equally possible with the rest - but we'll leave that analysis to another day!
