Time Series Analysis of Geospatial Data

Gokulakkannan AK

Aspiring Data Analyst | Recent Graduate | Excel in Data Analytics | SQL | Python

Published: December 29, 2022

From geospatial information to a pandas dataframe for time series analysis

Time series analysis of geospatial data allows us to analyze and understand how events and attributes of a place change over time. Its use cases are wide ranging, particularly in social, demographic, environmental and meteorological/climate studies. In environmental sciences, for example, time series analysis helps analyze how the land cover/land use of an area changes over time and what drives those changes. It is also useful in meteorological studies for understanding spatio-temporal changes in weather patterns (I will shortly demonstrate one such case study using rainfall data). Social and economic sciences benefit hugely from such analysis in understanding the dynamics of temporal and spatial phenomena such as demographic, economic and political patterns.

Case study: daily rainfall pattern in Hokkaido, Japan

Data source

For this case study I am using the spatial distribution of rainfall in Hokkaido prefecture, Japan, from 01 January to 31 December 2020, which covers all 366 days of the year. I downloaded the data from ClimateServe, an open access spatial data platform that is a product of a joint NASA/USAID partnership.

Figure: Snapshot of some of the raster files in the local directory

Setup

First, I specify the folder where the raster files are stored so I can loop through them later on.

# specify folder path for raster dataset
tsFolderPath = './data/hokkaido/'

Next, I’m importing a few libraries, most of which will be familiar to data scientists. To work with raster data I’m using the rasterio library.


# import libraries
import os
import rasterio 
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

Visualize data

Let’s check out what the raster images look like in a plot. I’ll first load a single image using rasterio and then plot it using matplotlib.


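# load in raster data (first band); the file is closed when the block ends
with rasterio.open('./data/hokkaido/20201101.tif') as rf:
    band = rf.read(1)

fig, ax = plt.subplots(figsize=(15,5))

# draw the band and attach a colorbar to the image
im = ax.imshow(band, cmap='inferno')
fig.colorbar(im, ax=ax)
ax.axis('off')
# this file holds a single day (file name 20201101)
ax.set_title('Daily rainfall, 1 November 2020, Hokkaido, Japan');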

As you can see, this image is a grid of pixels, and the value of each pixel represents rainfall at that location. Brighter pixels have higher rainfall values. In the next section I am going to extract those values and transfer them into a pandas dataframe.

Extract data from raster files

Now on to the key step, extracting pixel values from each of the 366 raster images. The process is simple: we will loop through each image, read its pixel values, and store their mean in a list.

We will separately keep track of dates in another list. Where does the date information come from? If you take a closer look at the file names, you’ll notice that each file is named after its day.


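# create empty lists to store data
date = []
rainfall_mm = []

# loop through each raster
for file in os.listdir(tsFolderPath):

    # skip anything in the folder that is not a .tif raster
    if not file.endswith('.tif'):
        continue

    # read the first band into an array; the file is closed when the block ends
    with rasterio.open(os.path.join(tsFolderPath, file)) as rf:
        array = rf.read(1)

    # store the date (file name without '.tif') and the mean of the non-negative pixels
    date.append(file[:-4])
    rainfall_mm.append(array[array>=0].mean())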

Note that it did not take long to loop through 366 rasters because of low image resolution (i.e. large pixel size). However, it can be computationally intensive for high resolution datasets.
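The expression `array[array>=0].mean()` in the loop keeps only non-negative pixels before averaging. Here is a minimal sketch of what that filter does, using a small made-up array. It assumes that negative values such as -9999 mark pixels with no data, which the article does not state explicitly.

```python
import numpy as np

# Hypothetical 3x3 "raster" band; -9999 stands in for a no-data fill value (an assumption)
band = np.array([
    [1.0, 2.0, -9999.0],
    [3.0, -9999.0, 4.0],
    [0.0, 5.0, 6.0],
], dtype=np.float32)

# Boolean mask that is True for valid (non-negative) pixels
valid = band >= 0

naive_mean = band.mean()         # pulled far below zero by the fill values
valid_mean = band[valid].mean()  # average over the valid pixels only

print(int(valid.sum()), float(valid_mean), float(naive_mean))
```

If a raster had no valid pixels at all, the filtered array would be empty and its mean would be `nan`, so it may be worth checking for that case on other datasets.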

So we just created two lists: one stores the dates from the file names and the other holds the rainfall data. Here are the first five items of each list:


print(date[:5])
print(rainfall_mm[:5])


>> ['20200904', '20200910', '20200723', '20200509', '20200521']
>> [4.4631577, 6.95278, 3.4205956, 1.7203209, 0.45923564]
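The dates printed above are not in calendar order, because `os.listdir` does not guarantee any particular ordering. A folder can also contain files that are not rasters. The following is a stdlib-only sketch of turning such file names into dates while skipping anything that does not fit the `YYYYMMDD.tif` pattern. The helper name `date_from_filename` and the stray file names are hypothetical, added for illustration.

```python
from datetime import datetime
from pathlib import Path

def date_from_filename(name):
    """Return the date encoded in a 'YYYYMMDD.tif' file name, or None if it does not match."""
    path = Path(name)
    if path.suffix.lower() != '.tif':
        return None
    try:
        return datetime.strptime(path.stem, '%Y%m%d').date()
    except ValueError:
        # the stem is not a valid YYYYMMDD date
        return None

# File names in arbitrary order, plus two hypothetical stray files
names = ['20200904.tif', '20200910.tif', '.DS_Store', '20200723.tif', 'notes.txt']

# Parse every name, drop the ones that did not match, and sort by date
parsed = sorted(d for d in map(date_from_filename, names) if d is not None)
print([d.isoformat() for d in parsed])
```

Sorting at this stage is optional here, since the dataframe is sorted by date later on.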

Next, we transfer the lists into a pandas dataframe. We will then take an extra step to turn the dataframe into a time series object.

Convert to a time series dataframe

Transferring lists to a dataframe is an easy task in pandas:


# convert lists to a dataframe
df = pd.DataFrame(zip(date, rainfall_mm), columns = ['date', 'rainfall_mm']) 
df.head()

We now have a pandas dataframe, but notice that the ‘date’ column holds strings; pandas does not know yet that they represent dates. So we need to tweak it a little bit:


# convert the 'date' column from strings (YYYYMMDD) to datetime values
df['date'] = pd.to_datetime(df['date'], format='%Y%m%d')
df.head()



df['date'].info()

Now the ‘date’ column holds datetime values.
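As a small self-contained check of this conversion, the sketch below parses a few hand-typed strings in the same YYYYMMDD style as the file names. It passes an explicit `format` so that pandas does not have to infer the layout. The frame name `sample` is made up for this example.

```python
import pandas as pd

# Sample date strings in the same YYYYMMDD style as the raster file names
sample = pd.DataFrame({'date': ['20200904', '20200910', '20200723']})

# Parse with an explicit format instead of relying on inference
sample['date'] = pd.to_datetime(sample['date'], format='%Y%m%d')

# Confirm the column now has a datetime dtype
is_datetime = pd.api.types.is_datetime64_any_dtype(sample['date'])
print(is_datetime, sample['date'].min().date())
```

A string that does not match the format raises an error by default, which helps catch unexpected file names early.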

It is also a good idea to set the date column as the index. This facilitates slicing and filtering data by date or date range and makes plotting easy. We will first sort the dates into the right order and then set the column as the index.


df = df.sort_values('date')
df.set_index('date', inplace=True)
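To illustrate the kind of slicing and filtering a sorted datetime index makes possible, here is a short sketch on synthetic data. The frame `demo` and its values are invented stand-ins, not the Hokkaido rainfall figures.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the rainfall dataframe: one made-up value per day of 2020
idx = pd.date_range('2020-01-01', '2020-12-31', freq='D', name='date')
demo = pd.DataFrame({'rainfall_mm': np.arange(len(idx), dtype=float)}, index=idx)

# Partial-string indexing: every row from June 2020
june = demo.loc['2020-06']

# Slice a date range (both end points are included)
first_week = demo.loc['2020-01-01':'2020-01-07']

# Aggregate the daily values into monthly totals
monthly_total = demo['rainfall_mm'].resample('MS').sum()

print(len(june), len(first_week), len(monthly_total))
```

The same calls should work on `df` once its index has been sorted and set as shown above.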

Okay, all processing done. You are now ready to use this time series data however you wish. I’ll just plot the data to see how it looks.


# plot
df.plot(figsize=(12,3), grid =True);

Final word

Extracting interesting and actionable insights from geospatial time series data can be very powerful, as it shows data in both spatial and temporal dimensions. However, for data scientists without training in geospatial information this can be a daunting task. In this article I demonstrated with a case study how this seemingly difficult task can be done with minimal effort.
