Spatial Stats of Raster Upon Polygons - Zonal Operations
Chonghua Yin
Head of Data Science | Climate Risk & Extreme Event Modeling | AI & Geospatial Analytics
Map algebra, alternatively referred to as cartographic modeling, is a straightforward yet powerful algebraic approach that employs an array of tools, operators, and functions for conducting geographic analysis with raster data. It manipulates geographic data through mathematical functions to derive specific outcomes or new information.
There are four basic function types used in map algebra:
- Local operations: examine raster data cell by cell, comparing the value in each cell of one layer with the value in the corresponding cell of other layers, as in addition or subtraction.
- Focal operations: compare the value in each cell with the values in its neighboring cells. Convolutions, kernels, and moving windows are examples of focal operations.
- Global operations: generate results that apply to the entire layer, such as adding a scaling factor to all cells.
- Zonal operations: compute results within specified zones defined by raster or vector data, for example determining the total precipitation for a watershed.
Local and global operations are quite straightforward and easy to implement, and we presented focal operations before. In this tutorial, we will focus on the most interesting map algebra function type, zonal operations, which is also one of the most common tasks for developers of spatial analysis applications. For example, we can apply such functions to assess the risk of buildings being impacted by floods, wildfires, or other natural hazards within small-scale areas defined by polygons (Figure 1). The building data could be obtained from Google’s Open Buildings or Microsoft’s worldwide Building Footprints derived from satellite imagery.
The conventional method clips the raster data with each asset polygon and then performs statistical analysis on each clip individually (e.g., using rasterstats, rioxarray, salem, xarray, geopandas, etc.). This method is intuitive and easy to understand, but it becomes highly inefficient when there are many polygons (e.g., hundreds of thousands or more). As a result, some people have turned to a parallel approach, where asset polygons are divided into several groups and extraction and statistics are performed on each group separately (e.g., with Dask-GeoPandas). This does increase processing speed, but the workflow remains iterative in essence, albeit grouped, so the speed improvement is still relatively limited when dealing with many polygons.
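As a quick illustration of the clipping approach, here is a minimal sketch using rasterstats together with geopandas. The file names flood_depth.tif and buildings.geojson are placeholders, not data from this article.

```python
# Minimal sketch of the per-polygon clipping approach with rasterstats.
# "buildings.geojson" and "flood_depth.tif" are placeholder inputs.
import geopandas as gpd
import pandas as pd
from rasterstats import zonal_stats

buildings = gpd.read_file("buildings.geojson")

# zonal_stats clips the raster with each polygon in turn and computes the
# requested statistics; this per-polygon behaviour is what becomes slow
# once hundreds of thousands of polygons are involved.
stats = zonal_stats(
    buildings.geometry,
    "flood_depth.tif",
    stats=["min", "max", "mean", "count"],
)

# Attach the per-building statistics back to the GeoDataFrame.
flood_stats = pd.DataFrame(stats, index=buildings.index)
buildings = buildings.join(flood_stats.add_prefix("flood_"))
```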
We can contemplate a reverse approach as well. Why not rasterize the asset polygons directly into raster data that closely mirrors the target raster (e.g., the flood raster in Figure 1), with the same spatial domain and resolution? By pursuing this strategy, we can use a distinct field within the asset polygons as a key to pinpoint the location of each polygon. This effectively generates a mask matrix for extracting data from the target raster and conducting the subsequent statistical analyses (e.g., Figure 2). Tools that can rasterize polygons include rasterio, regionmask, geocube, GDAL, etc.
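A rough sketch of this rasterize-then-mask idea, assuming the same placeholder inputs as above, might use rasterio.features.rasterize to burn each polygon's index onto the flood raster's grid and then aggregate with a vectorised pandas groupby:

```python
# Sketch of the reverse approach: burn the polygons onto the flood raster's
# grid, then aggregate by zone id instead of looping over polygons.
import geopandas as gpd
import pandas as pd
import rasterio
from rasterio import features

buildings = gpd.read_file("buildings.geojson")

with rasterio.open("flood_depth.tif") as src:
    flood = src.read(1)
    transform = src.transform
    nodata = src.nodata
    out_shape = (src.height, src.width)

# Burn each polygon's (index + 1) as its zone id; 0 stays as background.
zones = features.rasterize(
    ((geom, i + 1) for i, geom in enumerate(buildings.geometry)),
    out_shape=out_shape,
    transform=transform,
    fill=0,
    dtype="int32",
)

# Keep only cells that fall inside a polygon and hold valid data.
valid = zones > 0
if nodata is not None:
    valid &= flood != nodata

# One vectorised groupby replaces the per-polygon loop.
zonal = (
    pd.DataFrame({"zone": zones[valid], "value": flood[valid]})
    .groupby("zone")["value"]
    .agg(["min", "max", "mean", "count"])
)
```

One caveat with this design: polygons smaller than a raster cell may not cover any cell centre and therefore receive no cells at all; rasterize's all_touched=True option is one way to handle that, at the cost of over-counting boundary cells.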
Indeed, there are other choices available. We can employ specialized spatial analysis software for this kind of task. For instance, ArcGIS Pro provides the ZonalStatisticsAsTable function, and QGIS offers QgsZonalStatistics. It's important to note that this approach requires installing these software packages to access their functionality, which might not align with your preferences if you want to rely on lightweight scripts only. If you read the technical documentation for ArcGIS Pro’s ZonalStatisticsAsTable, you'll notice that it starts by rasterizing the polygons to generate masking arrays, essentially sharing the same concept as the second method.
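For completeness, a hedged sketch of the ArcGIS Pro route via arcpy is shown below. It assumes a licensed Spatial Analyst extension and the same placeholder inputs, with building_id standing in for whatever unique zone field your data carries.

```python
# Sketch of zonal statistics through arcpy's Spatial Analyst extension.
# Input paths and the zone field name are placeholders.
import arcpy
from arcpy.sa import ZonalStatisticsAsTable

arcpy.CheckOutExtension("Spatial")

ZonalStatisticsAsTable(
    in_zone_data="buildings.shp",        # asset polygons acting as zones
    zone_field="building_id",            # assumed unique identifier field
    in_value_raster="flood_depth.tif",
    out_table="flood_zonal_stats.dbf",
    ignore_nodata="DATA",                # skip NoData cells
    statistics_type="ALL",
)
```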
Of course, each approach has its strengths and weaknesses. If you have a relatively small amount of polygonal data, you may opt for the first method. However, the second method may be more suitable for dealing with many polygons. The first two methods provide greater flexibility and versatility. On the other hand, if your analysis workflow is closely tied to a specific software, such as ArcGIS or QGIS, then it's advisable to consider the third approach. This ensures compatibility with your software and can streamline your analysis process within that particular environment.