Getting started with Spatial Indexes
Spatial datasets are rapidly becoming bigger and more complex. The traditional tools and methods used to process and analyze these are increasingly failing to meet the challenges associated with this complexity.
Enter Spatial Indexes.
Spatial Indexes - sometimes referred to as Data Cubes or Discrete Global Grid Systems - are global grid systems which tessellate the world into regular, evenly-shaped grid cells to encode location. They are gaining popularity across geospatial roles because they are designed for very fast big data analysis. They are hierarchical, with resolutions ranging from feet to miles, and each cell has a direct relationship to its "parent", "child" and "neighbor" cells, which is what makes spatial operations on them so fast.
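To make the "parent" and "child" relationships concrete, here is a minimal sketch using Quadkey, a square-tile spatial index, rather than H3, simply because it can be written with the Python standard library alone. The function names (latlng_to_quadkey, tile_to_quadkey, parent, children) and the sample coordinates are illustrative assumptions, not part of any particular library.

```python
import math

MAX_LAT = 85.05112878  # Web Mercator latitude limit used by quadkey tiling


def tile_to_quadkey(x, y, zoom):
    """Interleave the bits of tile x/y into a base-4 string, one digit per level."""
    digits = []
    for level in range(zoom, 0, -1):
        mask = 1 << (level - 1)
        digit = (1 if x & mask else 0) + (2 if y & mask else 0)
        digits.append(str(digit))
    return "".join(digits)


def latlng_to_quadkey(lat, lng, zoom):
    """Return the quadkey of the tile that contains (lat, lng) at this zoom."""
    lat = max(-MAX_LAT, min(MAX_LAT, lat))
    n = 2 ** zoom  # number of tiles along each axis
    sin_lat = math.sin(math.radians(lat))
    x = int((lng + 180.0) / 360.0 * n)
    y = int((0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * n)
    # Clamp so that points on the far edge still fall inside the grid.
    x = max(0, min(n - 1, x))
    y = max(0, min(n - 1, y))
    return tile_to_quadkey(x, y, zoom)


def parent(quadkey):
    """The parent cell is the same key with the last digit dropped."""
    return quadkey[:-1]


def children(quadkey):
    """Each cell splits into exactly four children at the next resolution."""
    return [quadkey + d for d in "0123"]


# Hypothetical sample point (central Madrid) at zoom level 15.
cell = latlng_to_quadkey(40.4168, -3.7038, 15)
print("cell:    ", cell)
print("parent:  ", parent(cell))
print("children:", children(cell))
```

Moving up or down the hierarchy here is a plain string operation with no geometry involved, which is the property that keeps these operations cheap. H3 exposes the same concepts through its own functions and hexagonal cells.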
How are organizations using Spatial Indexes?
Organizations are increasingly using Spatial Indexes as a “geographical support system.” By transforming data into these grids, they are able to analyze larger datasets faster and more efficiently.
Spatial Indexes are particularly popular with companies which generate large amounts of location data, and whose profitability is reliant on understanding this. This location data might be GPS tracks, telemetry, remote sensing or customer transactions. By shifting from traditional geometries to the more efficient Spatial Indexes approach, they are overcoming both operational and analytical limitations.
What are the advantages of Spatial Indexes?
There are five key benefits to using Spatial Indexes.
1 Efficiency
Spatial Indexes make storage more compact and queries much faster and more computationally efficient, minimizing the time spent waiting for processes to complete. Traditional geometries are described in data as a series of vertices which can require a large amount of storage, particularly for geometrically complex features. In contrast, Spatial Indexes are described by a reference string which is much shorter, therefore taking up much less storage space. This compact representation also means geospatial operations can be carried out as very fast numerical operations.
What does this look like in practice? Well, a 15-character H3 string takes up 19 bytes of storage in BigQuery. Contrast this with a typical point (40 bytes) or polygon (8.6 kilobytes) and you can see the difference in scale (illustrated below). In terms of processing time, a spatial join (see section 5 for further details) that would take 26 minutes and 5 seconds using geometries takes just 3 seconds when working with Spatial Indexes. That’s a saving of 99.8%!
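The arithmetic behind these figures can be checked with a short sketch. All numbers are taken from the paragraph above; the only assumption is that "8.6 kilobytes" means 8,600 bytes.

```python
# Join timings quoted in the article, converted to seconds.
geometry_join_s = 26 * 60 + 5   # 26 minutes and 5 seconds
index_join_s = 3

saving_pct = (geometry_join_s - index_join_s) / geometry_join_s * 100
print(f"Processing time saved: {saving_pct:.1f}%")

# Storage sizes quoted in the article.
h3_bytes = 19
point_bytes = 40
polygon_bytes = 8.6 * 1000      # assumption: 1 kilobyte = 1,000 bytes

print(f"Point vs H3 cell:   {point_bytes / h3_bytes:.1f}x larger")
print(f"Polygon vs H3 cell: {polygon_bytes / h3_bytes:.0f}x larger")
```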
2 Flexibility and scalability
Spatial datasets can often be sourced from a wide range of geographical zones and resolutions, which can make them difficult to compare and cross-analyze. Combining these into one single-resolution spatial index allows for ease and flexibility in comparison and analysis. Converting data to a spatial index is straightforward (see section 5 for a guide) and much faster than converting to a different geometry zone; an index-to-index enrichment is typically 99.8% faster than a polygon-to-polygon enrichment.
They are also highly scalable. With traditional geometries, query costs increase exponentially with geometry size and complexity. In contrast, Spatial Indexes are far more scalable with the cost of queries experiencing little increase for larger numbers of features. This harnesses the true, highly distributed power of Spatial Data Warehouses.
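As a rough sketch of why index-to-index enrichment is cheap: once two datasets share the same grid and resolution, combining them is an equality match on the cell ID rather than a geometric intersection test. The example below uses a dependency-free Quadkey encoder in place of H3, and the datasets, names and zoom level are hypothetical.

```python
import math
from collections import Counter, defaultdict

ZOOM = 15  # assumption: one shared resolution for every dataset


def latlng_to_quadkey(lat, lng, zoom):
    """Return the quadkey (base-4 cell ID) of the tile containing a point."""
    lat = max(-85.05112878, min(85.05112878, lat))
    n = 2 ** zoom
    sin_lat = math.sin(math.radians(lat))
    x = max(0, min(n - 1, int((lng + 180.0) / 360.0 * n)))
    y = max(0, min(n - 1, int(
        (0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)) * n)))
    return "".join(
        str((1 if x & (1 << i) else 0) + (2 if y & (1 << i) else 0))
        for i in range(zoom - 1, -1, -1)
    )


# Hypothetical inputs: GPS pings and customer transactions (lat, lng, amount).
pings = [(40.4168, -3.7038), (40.4168, -3.7038), (40.4530, -3.6883)]
sales = [(40.4168, -3.7038, 12.5), (40.4530, -3.6883, 30.0),
         (40.4530, -3.6883, 7.5)]

# Step 1: convert each dataset to the shared grid and aggregate per cell.
ping_counts = Counter(latlng_to_quadkey(lat, lng, ZOOM) for lat, lng in pings)
sales_totals = defaultdict(float)
for lat, lng, amount in sales:
    sales_totals[latlng_to_quadkey(lat, lng, ZOOM)] += amount

# Step 2: enrich one dataset with the other by matching on the cell ID.
enriched = {
    cell: {"pings": count, "sales": sales_totals.get(cell, 0.0)}
    for cell, count in ping_counts.items()
}

for cell, values in enriched.items():
    print(cell, values)
```

In a data warehouse the second step would typically be an ordinary join on the cell ID column, which is the kind of operation these systems distribute well.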
3 Visual Clarity
Because spatial index cells are all the same shape and (normally) the same size, maps built on them are much more intuitive to understand, which makes communicating your analysis much easier. Administrative geographies, in contrast, are typically sized so that each zone contains roughly the same population. This means zones in dense urban areas can be so small that the map reader can barely see them.
4 Objectivity
Working with Spatial Indexes can mitigate the bias associated with irregular administrative geographies. Boundary manipulation or “Gerrymandering” is a way of redrawing administrative boundaries to obtain a specific outcome such as an election result. This can distort the outcomes of spatial analysis, a problem that objectively-drawn Spatial Indexes help to mitigate.
5 Collaboration
For the above reasons, grids are a popular way to analyze and visualize spatial data. With traditional GIS methods, this would often involve analysts independently creating custom grids. However, by tapping into an open global Spatial Index, data scientists can much more fluidly share data and collaborate.
Get started with Spatial Indexes!
This article is just the first chapter of our e-book Spatial Indexes 101! Download your FREE copy to learn more about Spatial Indexes, including practical examples and exercises!