Histograms of DEM Values

A histogram is an approximate representation of the distribution of numerical data. It is an estimate of the probability distribution of a continuous variable and was first introduced by Karl Pearson (Pearson, 1895). Histograms give a rough sense of the density of the underlying distribution of the data and are often used for density estimation, that is, estimating the probability density function of the underlying variable. The total area of a histogram used for probability density is always normalized to 1.
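The normalization to unit area can be checked numerically. The following is a minimal sketch using NumPy only; the synthetic `values` array is an assumption standing in for real elevation data.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for elevation values (not the DEM used in this notebook)
values = rng.normal(loc=500.0, scale=50.0, size=10_000)

# density=True rescales the counts so that bar height * bar width sums to 1
density, edges = np.histogram(values, bins=30, density=True)
widths = np.diff(edges)
total_area = float(np.sum(density * widths))
print(total_area)
```

Note that the bar heights themselves are densities, not probabilities, so an individual bar can be taller than 1 when the bins are narrow.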

A histogram of DEM values shows the distribution of pixel elevation values. This plot is useful to:

  • Identify outlier data values
  • Assess the min and max values in the data
  • Explore the general distribution of elevation values in the data, e.g. whether the area is generally flat or hilly, and whether it is at high or low elevation.
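The same checks can also be done numerically before plotting. This is a rough sketch on synthetic data; the 1.5 × IQR rule used to flag outliers is one common convention chosen here for illustration, not something prescribed by this notebook.

```python
import numpy as np

rng = np.random.default_rng(1)
# Synthetic elevations in meters (an assumption, not the real DEM)
elev = rng.normal(300.0, 40.0, size=5_000)
# Inject three implausible values to play the role of outliers
elev[:3] = [-50.0, 2500.0, 3000.0]

# Minimum and maximum of the data
vmin, vmax = float(elev.min()), float(elev.max())

# Flag values outside 1.5 * IQR of the quartiles (illustrative rule)
q1, q3 = np.percentile(elev, [25, 75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = elev[(elev < lower) | (elev > upper)]

print(f"min={vmin:.1f} m, max={vmax:.1f} m, flagged={outliers.size}")
```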

In this notebook, xarray and rasterio are again used to read the data, and seaborn is used to plot the histograms. Seaborn is a Python data visualization library based on matplotlib. It provides a high-level interface for drawing attractive and informative statistical graphics.

%matplotlib inline

import numpy as np
import xarray as xr
from scipy.stats import norm

import seaborn as sns
import matplotlib.pyplot as plt

import warnings
warnings.filterwarnings("ignore")

Read data

infile = "data/FloodFill/terrain.asc"

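# Note: xr.open_rasterio is deprecated in newer xarray releases in favor of rioxarray.open_rasterio
da_dem = xr.open_rasterio(infile).drop('band')[0].rename({'x':'longitude', 'y':'latitude'})
nodata = da_dem.nodatavals[0]
# Mask nodata cells as NaN; the ">" test assumes nodata is a large negative sentinel (e.g. -9999)
da_dem = da_dem.where(da_dem>nodata, np.nan)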
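The effect of the nodata masking can be seen on a tiny array. This sketch uses plain NumPy and an assumed sentinel of -9999; in the notebook the real value comes from the file's metadata.

```python
import numpy as np

# Assumed nodata sentinel; the real one is read from the raster metadata
nodata = -9999.0
grid = np.array([[12.5, 13.0, nodata],
                 [nodata, 14.2, 15.1]])

# Replace sentinel cells with NaN so NaN-aware statistics skip them
masked = np.where(grid == nodata, np.nan, grid)
n_missing = int(np.isnan(masked).sum())

print(n_missing, np.nanmin(masked), np.nanmax(masked))
```

Without the mask, the sentinel would dominate the minimum and stretch the histogram's x-axis far below the real elevations.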

DEM Histogram

  • Convert to a 1D array using .ravel() so it can be plotted as a histogram.
  • Remove NaN values, as sns.distplot does not support NaNs.
arr_dem = da_dem.values.ravel()
arr_dem = arr_dem[~np.isnan(arr_dem)]

Plotting

Plotting is a little slow because the DEM contains a large number of grid cells, and rug=True draws one tick per value.

fig, ax = plt.subplots(figsize=(12, 6))
sns.distplot(arr_dem, 
             kde=False, 
             rug=True,
             ax=ax
            )

_ = ax.set_title("Distribution of DEM Elevation Values", fontsize=16)
_ = ax.set_xlabel('Elevation (meters)')
_ = ax.set_ylabel('Frequency')

A histogram is a good way to show how the data are distributed. Each bin or bar in the plot represents the number or frequency of pixels that fall within the range specified by the bin.
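To make the meaning of a bin concrete, the counts can be computed directly with `np.histogram`. This is a small sketch with ten made-up elevation values and four 10 m bins.

```python
import numpy as np

# Ten made-up pixel elevations in meters
elev = np.array([101.0, 104.0, 109.0, 112.0, 118.0,
                 121.0, 127.0, 133.0, 138.0, 140.0])

# Four equal-width bins between 100 m and 140 m
counts, edges = np.histogram(elev, bins=4, range=(100.0, 140.0))

# Each bin is half-open [left, right), except the last, which includes its right edge
for left, right, n in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f}-{right:.0f} m: {n} pixels")
```

The counts always add up to the number of values that fall inside the overall range.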

In addition, seaborn.distplot offers quite a few options: seaborn.distplot(a, bins=None, hist=True, kde=True, rug=False, fit=None, hist_kws=None, kde_kws=None, rug_kws=None, fit_kws=None, color=None, vertical=False, norm_hist=False, axlabel=None, label=None, ax=None). The bins= argument specifies fewer or more breaks in the histogram: an integer sets the number of equal-width bins, a sequence sets the bin edges explicitly, and the default bins=None lets seaborn choose the number with a reference rule (Freedman-Diaconis). Different combinations of options produce different visual effects, so it is worth experimenting.
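The different ways of specifying bins can be compared without plotting. This sketch uses NumPy's own binning functions rather than seaborn, on synthetic values; the 50 m class width is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(3)
# Synthetic elevations in meters
values = rng.normal(250.0, 30.0, size=1_000)

# An integer gives exactly that many equal-width bins (n + 1 edges)
edges_10 = np.histogram_bin_edges(values, bins=10)

# A rule name lets NumPy pick the bin width (Freedman-Diaconis here)
edges_fd = np.histogram_bin_edges(values, bins="fd")

# An explicit sequence fixes the edges, e.g. 50 m elevation classes
edges_custom = np.arange(100.0, 401.0, 50.0)
counts_custom, _ = np.histogram(values, bins=edges_custom)

print(len(edges_10) - 1, len(edges_fd) - 1, len(edges_custom) - 1)
```

Explicit edges are handy when histograms of several DEMs need to be compared on the same elevation classes.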

Fitting a distribution

From the above histogram, the DEM values appear to follow a roughly normal distribution. Let's check by fitting one.

fig, ax = plt.subplots(figsize=(12, 6))
sns.distplot(arr_dem, 
             fit=norm, 
             kde=False, 
             ax=ax)

_ = ax.set_title("Distribution of DEM Elevation Values", fontsize=16)
_ = ax.set_xlabel('Elevation (meters)')
_ = ax.set_ylabel('Probability density')
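The fitted parameters can also be obtained directly from SciPy, which is useful for reporting them or for judging the fit numerically. This is a minimal sketch on a synthetic sample; comparing the density histogram with the fitted curve at the bin centers is an illustrative check, not a formal goodness-of-fit test.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
# Synthetic stand-in for the flattened DEM values
sample = rng.normal(350.0, 45.0, size=5_000)

# Maximum-likelihood estimates of the mean (loc) and standard deviation (scale)
mu, sigma = norm.fit(sample)

# Compare the density histogram with the fitted pdf at the bin centers
density, edges = np.histogram(sample, bins=40, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
max_gap = float(np.max(np.abs(density - norm.pdf(centers, mu, sigma))))

print(f"mu={mu:.1f} m, sigma={sigma:.1f} m, largest gap={max_gap:.4f}")
```

For real DEMs, a large gap or a visibly skewed histogram suggests that a normal distribution is not a good description of the terrain.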

References

Pearson, K. (1895). Contributions to the Mathematical Theory of Evolution. II. Skew Variation in Homogeneous Material. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences. 186: 343–414. Bibcode:1895RSPTA.186..343P. doi:10.1098/rsta.1895.0010

Travis E. Oliphant. A Guide to NumPy, USA: Trelgol Publishing, (2006).

Stéfan van der Walt, S. Chris Colbert and Gaël Varoquaux. The NumPy Array: A Structure for Efficient Numerical Computation, Computing in Science & Engineering, 13, 22-30 (2011), DOI:10.1109/MCSE.2011.37

Fernando Pérez and Brian E. Granger. IPython: A System for Interactive Scientific Computing, Computing in Science & Engineering, 9, 21-29 (2007), DOI:10.1109/MCSE.2007.53

John D. Hunter. Matplotlib: A 2D Graphics Environment, Computing in Science & Engineering, 9, 90-95 (2007), DOI:10.1109/MCSE.2007.55

https://rasterio.readthedocs.io/en/stable/

https://xarray.pydata.org/en/stable/

https://en.wikipedia.org/wiki/Histogram

https://seaborn.pydata.org/generated/seaborn.distplot.html#seaborn.distplot
