Spatial-temporal Decomposition Methods in Climate Data Analysis

Chonghua Yin

Climate Scientist | Data Scientist | Developer

Published: April 23, 2018

Large datasets are increasingly common in many disciplines, climate and weather science among them. The climate system is the result of highly complex interactions between many degrees of freedom, or modes. To gain insight into the dynamical and physical behavior involved, methods are needed that drastically reduce the dimensionality of the data in an interpretable way while preserving most of the information. This has led atmospheric researchers to develop methods that represent a large space-time dataset as a set of spatial patterns paired with time series.

Among these methods, principal component analysis (PCA), also known as empirical orthogonal function (EOF) analysis, is one of the oldest and most widely used. To overcome some limitations of classical EOF/PCA analysis and make the resulting patterns more physically interpretable, many extensions have been developed, such as rotated EOFs, extended EOFs, and complex EOFs. Reviews of PCA/EOFs can be found in Kutzbach (1967), Hannachi (2004), and Hannachi et al. (2007). A few non-EOF methods are also mentioned briefly here.

This short note does not attempt to list all EOF-related methods. It is a brief memorandum on where to start looking when a data analysis calls for a spatial-temporal decomposition method.

Methods

· PCA – Principal Component Analysis

PCA is a statistical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components. The earliest literature on PCA dates from Pearson (1901) and Hotelling (1933). PCA can be presented as

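y = v’x,

where x is the vector of centered variables, v is an eigenvector of their covariance matrix, and y is the corresponding principal component.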

In simple words, PCA extracts a small set of new variables (the components) from the large set of variables in a dataset. It derives a low-dimensional set of features from a high-dimensional dataset with the aim of capturing as much information as possible. With fewer variables, visualization also becomes more meaningful. PCA is most useful when dealing with data of three or more dimensions.

The first principal component accounts for as much of the variability in the data as possible, and each succeeding component accounts for as much of the remaining variability as possible.
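The relation y = v’x and the variance ordering can be illustrated with a minimal NumPy sketch. It assumes the data are arranged as samples by variables and uses an eigendecomposition of the sample covariance matrix; the function name pca and the synthetic data are illustrative only.

```python
import numpy as np

def pca(x, n_components):
    """PCA of x with shape (n_samples, n_variables) via the covariance matrix."""
    xc = x - x.mean(axis=0)                   # center each variable
    cov = xc.T @ xc / (xc.shape[0] - 1)       # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    v = eigvecs[:, :n_components]             # leading eigenvectors as columns
    y = xc @ v                                # scores: y = v'x for every sample
    explained = eigvals[:n_components] / eigvals.sum()
    return y, v, explained

# Synthetic example: three correlated variables, one of them nearly constant.
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0], [1.0, 1.0, 0.0], [0.0, 0.0, 0.1]])
x = rng.normal(size=(500, 3)) @ mixing
y, v, explained = pca(x, n_components=2)
print(explained)  # fraction of total variance carried by each component
```

The resulting components are mutually uncorrelated, and the explained fractions decrease from the first component onward.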

· EOF – Empirical Orthogonal Function

One discipline in which PCA has been widely used is atmospheric science. It was first suggested in that field by Obukhov (1947) and Lorenz (1956) and, uniquely to that discipline, it is usually known as empirical orthogonal function (EOF) analysis. EOF analysis is a standard method in the earth and marine sciences for exploring spatio-temporal variation in a variable. The simplicity and the analytic derivation of EOFs are the main reasons behind its popularity in atmospheric science. EOF can be presented as

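X = VY,

where X is the space-time data matrix, the columns of V are the EOFs (spatial patterns), and the rows of Y are the corresponding time series (principal components).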

The original purpose of EOFs was to reduce a large number of variables of the original data to a few variables, but without compromising much of the explained variance. Lately, however, EOF analysis has been used to extract individual modes of variability such as the Arctic Oscillation (AO).
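As a rough sketch of X = VY in practice, the example below computes EOFs of a gridded field with a singular value decomposition. The layout (time, lat, lon), the function name eof_analysis, and the synthetic standing pattern are assumptions made for illustration; area weighting of the grid, which is common for real latitude-longitude data, is left out to keep the sketch short.

```python
import numpy as np

def eof_analysis(field, n_modes):
    """EOFs of a field with shape (n_time, n_lat, n_lon) via SVD."""
    n_time, n_lat, n_lon = field.shape
    x = field.reshape(n_time, n_lat * n_lon).T       # X with shape (space, time)
    x = x - x.mean(axis=1, keepdims=True)            # remove the time mean per grid point
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    v = u[:, :n_modes]                               # V: EOFs as columns
    y = s[:n_modes, None] * vt[:n_modes]             # Y: PC time series as rows
    frac = s[:n_modes] ** 2 / np.sum(s ** 2)         # explained variance fractions
    eof_maps = v.T.reshape(n_modes, n_lat, n_lon)    # back to map form
    return eof_maps, y, frac

# Synthetic field: one standing pattern modulated in time, plus noise.
rng = np.random.default_rng(1)
lat = np.linspace(-1.0, 1.0, 10)
lon = np.linspace(0.0, 2.0 * np.pi, 20, endpoint=False)
pattern = np.outer(np.cos(0.5 * np.pi * lat), np.sin(lon))
amplitude = np.sin(2.0 * np.pi * np.arange(300) / 50.0)
field = amplitude[:, None, None] * pattern + 0.1 * rng.normal(size=(300, 10, 20))
eof_maps, pcs, frac = eof_analysis(field, n_modes=2)
print(frac)
```

Here the first EOF map resembles the imposed pattern and carries most of the variance, which is the sense in which a few modes can stand in for the full field.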

· MVEOF – Multi-Variate EOF

MVEOF, also called combined EOF, extends conventional EOF analysis by using both spatial and intervariable coherence, which enables a more efficient compaction of multifield data. More importantly, the derived EOFs can capture the dominant patterns of the spatial phase relationships among the various fields. This often leads to physical insight into the interactive processes within a complex system such as the ocean-atmosphere climate system (Wang, 1992; Wang et al. 2008; He et al. 2015).
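One common way to set this up, sketched here under stated assumptions, is to standardize each field and concatenate the fields along the space dimension before a single decomposition. The per-grid-point standardization, the function name combined_eof, and the two synthetic fields (labeled sst and slp purely for illustration) are choices of this example rather than part of the cited studies.

```python
import numpy as np

def combined_eof(fields, n_modes):
    """Combined EOF of several fields, each with shape (n_time, n_space_k)."""
    blocks = []
    for f in fields:
        anomalies = f - f.mean(axis=0)
        # Assumption: standardize every grid point so fields with
        # different units contribute on a comparable scale.
        blocks.append(anomalies / anomalies.std(axis=0))
    x = np.hstack(blocks)                              # (time, total space)
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    pcs = u[:, :n_modes] * s[:n_modes]                 # one shared PC per mode
    frac = s[:n_modes] ** 2 / np.sum(s ** 2)
    edges = np.cumsum([f.shape[1] for f in fields])[:-1]
    patterns = np.split(vt[:n_modes], edges, axis=1)   # one pattern block per field
    return patterns, pcs, frac

# Two synthetic fields driven by the same oscillation, in very different units.
rng = np.random.default_rng(2)
driver = np.sin(2.0 * np.pi * np.arange(240) / 40.0)
sst = driver[:, None] * rng.normal(size=30) + 0.3 * rng.normal(size=(240, 30))
slp = 100.0 * (-driver[:, None] * rng.normal(size=40) + 0.3 * rng.normal(size=(240, 40)))
patterns, pcs, frac = combined_eof([sst, slp], n_modes=2)
print(frac)
```

Each mode then has a single time series and one spatial pattern per field, so the relationship between the fields can be read off the paired patterns.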

· EEOF – Extended EOF

EEOFs can be seen as a form of MVEOF in which the additional variables are lagged versions of the same process, so that the analysis deals not only with the spatial but also with the temporal correlations observed in weather/climate data. The method was first introduced by Weare and Nasstrom (1982), who applied it to the 300-mb relative vorticity to identify propagating structures.
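The lagged-copy idea can be sketched as follows: time-shifted copies of the field are placed side by side and decomposed together. The function name extended_eof, the number of lags, and the synthetic propagating wave are illustrative assumptions.

```python
import numpy as np

def extended_eof(x, n_lags, n_modes):
    """Extended EOF of x with shape (n_time, n_space) using n_lags lagged copies."""
    n_time, n_space = x.shape
    n_rows = n_time - n_lags + 1
    # Row t holds [x(t), x(t+1), ..., x(t+n_lags-1)] side by side.
    ext = np.hstack([x[lag:lag + n_rows] for lag in range(n_lags)])
    ext = ext - ext.mean(axis=0)
    u, s, vt = np.linalg.svd(ext, full_matrices=False)
    # Each mode is a short sequence of spatial patterns, one per lag.
    patterns = vt[:n_modes].reshape(n_modes, n_lags, n_space)
    pcs = u[:, :n_modes] * s[:n_modes]
    frac = s[:n_modes] ** 2 / np.sum(s ** 2)
    return patterns, pcs, frac

# Synthetic propagating wave with a little noise.
rng = np.random.default_rng(3)
t = np.arange(200)[:, None]
s_grid = np.arange(16)[None, :]
wave = np.cos(2.0 * np.pi * s_grid / 16.0 - 2.0 * np.pi * t / 20.0)
x = wave + 0.1 * rng.normal(size=wave.shape)
patterns, pcs, frac = extended_eof(x, n_lags=5, n_modes=2)
print(frac)
```

For a propagating signal like this one, the leading two modes come as a pair with nearly equal variance, and the lag sequence within each mode shows the pattern moving.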

· CVEOF – Complex Vector EOF

Some variables can be represented as vectors. Wind, for example, has a zonal component u and a meridional component v, and can be written in complex form as w = u + iv. CVEOF carries out EOF analysis on the resulting complex matrix, just as conventional EOF/PCA does on a real one.
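A minimal sketch of this, assuming u and v are arranged as time by space arrays: the complex matrix w = u + iv is decomposed with a complex SVD, which is equivalent to an eigenanalysis of its Hermitian covariance matrix. The function name complex_vector_eof and the synthetic rotating wind are made up for the example.

```python
import numpy as np

def complex_vector_eof(u, v, n_modes):
    """EOF analysis of the complex field w = u + iv, with u, v of shape (n_time, n_space)."""
    w = u + 1j * v
    w = w - w.mean(axis=0)                          # remove the time mean
    left, s, vh = np.linalg.svd(w, full_matrices=False)
    eofs = vh[:n_modes].conj()                      # complex spatial patterns, one per row
    pcs = left[:, :n_modes] * s[:n_modes]           # complex time series, one per column
    frac = s[:n_modes] ** 2 / np.sum(s ** 2)
    return eofs, pcs, frac, w

# Synthetic wind that rotates in time with a different speed at each point.
rng = np.random.default_rng(4)
t = np.arange(120)
speed = rng.uniform(1.0, 3.0, size=8)
u = np.cos(2.0 * np.pi * t / 30.0)[:, None] * speed + 0.2 * rng.normal(size=(120, 8))
v = np.sin(2.0 * np.pi * t / 30.0)[:, None] * speed + 0.2 * rng.normal(size=(120, 8))
eofs, pcs, frac, w = complex_vector_eof(u, v, n_modes=8)
print(frac[0])
```

With all modes retained, pcs @ eofs.conj() reproduces w. The modulus of each EOF entry gives a wind amplitude, and its argument gives a direction relative to the rotating time series.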

· JEOF – Joint EOF

JEOF is an extension of EEOF that deals with two variables rather than the single variable of the original EEOF. Because the original variables may have different scales, JEOF should be supplied with normalized versions of them.

· CEOF – Complex EOF

Empirical orthogonal function analysis of data fields is commonly carried out under the assumption that each field can be represented as a spatially fixed pattern of behavior. This approach, however, cannot be used to detect propagating features because it lacks phase information. For such cases, the CEOF technique was proposed (e.g., Davis 1976; Horel 1984; Barnett 1985; Preisendorfer 1988; Kaihatu et al. 1998) to separate the variables in space and time. Complex EOFs can capture the structure of non-stationary periodic variations, or of two orthogonal variables (e.g., zonal and meridional velocity), more effectively and in fewer modes.

CEOF analysis was introduced to analyze a set of time series that have phase lags among them. It augments each original time series with an imaginary component, namely the same series phase-shifted by 90 degrees, obtained with the Hilbert transform. The method is close to frequency-domain EOF analysis, but it does not require explicitly converting the time-domain data into the frequency domain.
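The Hilbert-transform step can be sketched with NumPy alone. In this illustration the analytic signal (the original series plus i times its Hilbert transform) is built with the FFT and then decomposed with a complex SVD; the function names and the synthetic propagating wave are assumptions of the example.

```python
import numpy as np

def analytic_signal(x):
    """Return x + i*Hilbert(x) along axis 0 (time), computed with the FFT."""
    n = x.shape[0]
    spectrum = np.fft.fft(x, axis=0)
    weights = np.zeros(n)
    weights[0] = 1.0                        # keep the mean
    if n % 2 == 0:
        weights[n // 2] = 1.0               # keep the Nyquist term
        weights[1:n // 2] = 2.0             # double positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights[:, None], axis=0)

def complex_eof(x, n_modes):
    """Complex EOF of x with shape (n_time, n_space)."""
    z = analytic_signal(x - x.mean(axis=0))
    left, s, vh = np.linalg.svd(z, full_matrices=False)
    eofs = vh[:n_modes].conj()              # complex spatial patterns
    pcs = left[:, :n_modes] * s[:n_modes]   # complex time series
    frac = s[:n_modes] ** 2 / np.sum(s ** 2)
    return eofs, pcs, frac

# Synthetic propagating wave with spatial wavenumber k.
rng = np.random.default_rng(5)
t = np.arange(200)[:, None]
s_grid = np.arange(16)[None, :]
k = 2.0 * np.pi / 16.0
x = np.cos(k * s_grid - 2.0 * np.pi * t / 20.0) + 0.05 * rng.normal(size=(200, 16))
eofs, pcs, frac = complex_eof(x, n_modes=1)
phase = np.unwrap(np.angle(eofs[0]))        # spatial phase of the leading mode
print(frac[0], np.diff(phase))
```

In this example a single complex mode captures the propagating wave, and the spatial phase changes by a steady amount of magnitude k from one point to the next; the sign of the phase gradient depends on the convention chosen for the patterns.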

· POP – Principal Oscillation Pattern

The principal oscillation pattern (POP) analysis is a technique used to simultaneously infer the characteristic patterns and timescales of a vector time series. The POPs may be seen as the normal modes of a linearized system whose system matrix is estimated from data (von Storch et al., 1995; Gehne et al., 2014).

The POP method is not a tool that is useful in all applications. If the analyzed vector time series exhibits strongly nonlinear behavior, the POPs may fail to identify a useful subsystem. However, if a significant portion of the variability of a nonlinear system is controlled by linear dynamics, the POP analysis may be successful in extracting principal modes of oscillation.
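A small sketch of the idea, assuming a first-order linear model x(t+1) = A x(t) + noise: the system matrix A is estimated from the lag-1 and lag-0 covariance matrices, and its eigenvectors and eigenvalues give the POPs with their periods and e-folding (decay) times. The function name pop_analysis and the synthetic damped rotation are illustrative; in practice the input would typically be a few leading PCs rather than raw grid-point data.

```python
import numpy as np

def pop_analysis(x):
    """POP analysis of x with shape (n_time, n_dim)."""
    x = x - x.mean(axis=0)
    x0, x1 = x[:-1], x[1:]
    c0 = x0.T @ x0 / len(x0)                 # lag-0 covariance
    c1 = x1.T @ x0 / len(x0)                 # lag-1 covariance
    a = c1 @ np.linalg.inv(c0)               # estimated system matrix
    lam, pops = np.linalg.eig(a)             # eigenvalues and POPs (columns)
    with np.errstate(divide="ignore"):
        # Real positive eigenvalues do not oscillate and get an infinite period.
        periods = 2.0 * np.pi / np.abs(np.angle(lam))
    efold = -1.0 / np.log(np.abs(lam))       # decay time in time steps
    return lam, pops, periods, efold

# Synthetic damped rotation with period 20 steps, driven by white noise.
rng = np.random.default_rng(6)
theta, damping = 2.0 * np.pi / 20.0, 0.95
a_true = damping * np.array([[np.cos(theta), -np.sin(theta)],
                             [np.sin(theta), np.cos(theta)]])
x = np.zeros((20000, 2))
for i in range(1, len(x)):
    x[i] = a_true @ x[i - 1] + rng.normal(size=2)
lam, pops, periods, efold = pop_analysis(x)
print(periods, efold)
```

Oscillatory POPs appear as complex-conjugate pairs, so the two eigenvalues here describe a single oscillation with a period close to 20 steps.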

· ICA – Independent Component Analysis

Independent component analysis (ICA) is a statistical and computational technique for revealing hidden factors that underlie sets of random variables, measurements, or signals.

ICA defines a generative model for the observed multivariate data, which is typically given as a large database of samples. In the model, the data variables are assumed to be linear mixtures of some unknown latent variables, and the mixing system is also unknown. The latent variables are assumed non-Gaussian and mutually independent, and they are called the independent components of the observed data. These independent components, also called sources or factors, can be found by ICA.

ICA is superficially related to principal component analysis and factor analysis. ICA is sometimes viewed as a method of EOF rotation. Starting from an initial EOF solution, rather than rotating the loadings toward simplicity, ICA seeks a rotation matrix that maximizes the independence between the components in the time domain. If the underlying climate signals have an independent forcing, one can expect to find loadings with interpretable patterns whose time coefficients have properties that go beyond the simple non-correlation observed in EOFs.

ICA is often more appropriate than PCA for analyzing time series, since the extraction of independent components (ICs) involves higher-order statistics, whereas PCA uses only second-order statistics to obtain the principal components (PCs), which are uncorrelated but not necessarily independent.
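The rotation view of ICA can be illustrated with a deliberately simple two-component sketch. The data are first whitened with PCA (second-order statistics), and a rotation angle is then chosen by brute-force search to maximize the sum of squared excess kurtosis, a fourth-order measure of non-Gaussianity. This is a toy contrast function chosen for brevity, not the algorithm of any particular ICA package, and all names are illustrative.

```python
import numpy as np

def whiten(x):
    """PCA whitening: uncorrelated components with unit variance."""
    xc = x - x.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(xc.T))
    return xc @ eigvecs / np.sqrt(eigvals)

def excess_kurtosis(z):
    z = (z - z.mean(axis=0)) / z.std(axis=0)
    return (z ** 4).mean(axis=0) - 3.0

def rotation(angle):
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]])

def rotate_to_independence(z, n_angles=181):
    """Search rotations of two whitened components for maximal non-Gaussianity."""
    # Angles in [0, pi/2] suffice: a quarter turn only swaps the components.
    angles = np.linspace(0.0, 0.5 * np.pi, n_angles)
    scores = [np.sum(excess_kurtosis(z @ rotation(a)) ** 2) for a in angles]
    best = angles[int(np.argmax(scores))]
    return z @ rotation(best), best

# Two independent non-Gaussian sources, linearly mixed.
rng = np.random.default_rng(0)
sources = np.column_stack([rng.uniform(-1.0, 1.0, 5000), rng.laplace(size=5000)])
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])
x = sources @ mixing.T
components, angle = rotate_to_independence(whiten(x))
print(np.abs(np.corrcoef(components.T, sources.T)[:2, 2:]).round(2))
```

The whitened components are uncorrelated at every rotation angle, so second-order statistics alone cannot pick the angle; the kurtosis-based search is what recovers the original sources, up to order and sign.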

· ST-MEMD – Spatio-Temporal Multivariate Empirical Mode Decomposition

ST-MEMD is a variation of classic Empirical Mode Decomposition (EMD) that takes spatial and temporal information into account simultaneously, whereas the original EMD processes each signal source in isolation. In the study by Davies and James (2014), ST-MEMD retained the gain in sensitivity and specificity that comes from adding spatial data, but the added temporal data made no meaningful difference to performance.

References

Gehne et al. (2014): Irregularity and decadal variation in ENSO: A simplified model based on Principal Oscillation Patterns. Climate Dynamics, 43:3327-3350.

Hannachi, A., 2004: A primer for EOF analysis of climate data. University of Reading, 33 pp.

Hannachi, A., I. T. Jolliffe, and D. Stephenson (2007), Empirical orthogonal functions and related techniques in atmospheric science: A review, Int. J. Climatol, 27, 1119–1152.

He, J., L.-Y. Chang, and H. Chen, 2015: Meridional propagation of the 30- to 60-day variability of precipitation in the East Asian subtropical summer monsoon region: Monitoring and prediction. Atmos.–Ocean, 53, 251–263, doi:10.1080/07055900.2015.1017798

Hotelling H. 1933. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 24, 417–441, 498–520 (doi: 10.1037/h0071325)

Kutzbach, J. E., 1967: Empirical eigenvectors of sea-level pressure, surface temperature and precipitation complexes over North America. J. Appl. Meteor., 6, 791-802.

Obukhov AM. 1947. Statistically homogeneous fields on a sphere. Usp. Mat. Nauk. 2, 196–198.

Lorenz EN. 1956. Empirical orthogonal functions and statistical weather prediction. Technical report, Statistical Forecast Project Report 1, Dept. of Meteor. MIT: 49.

Pearson K. 1901. On lines and planes of closest fit to systems of points in space. Phil. Mag. 2, 559–572. (doi: 10.1080/14786440109462720)

S. R. H. Davies, C. J. James, "Using empirical mode decomposition with spatio-temporal dynamics to classify single-trial motor imagery in BCI", 36th Annu. Int. Conf. IEEE Eng. Med. Biol. Soc., pp. 4631-4, Aug. 2014.

von Storch, Hans, Gerd Berger, Reiner Schnur, Jin-Song von Storch, 1995: Principal Oscillation Patterns: A Review. J. Climate, 8, 377

Weare BC, Nasstrom JS. 1982. Examples of extended empirical orthogonal function analysis. Monthly Weather Review 110: 481–485.

Wang, B., 1992: The vertical structure and development of the ENSO anomaly mode during 1979–1989. J. Atmos. Sci., 49, 698–712, doi:10.1175/1520-0469(1992)049<0698:TVSADO>2.0.CO;2.

Wang, B., Z. Wu, J. Li, J. Liu, C.-P. Chang, Y. Ding, and G. Wu, 2008: How to measure the strength of the East Asian summer monsoon. J. Climate, 21, 4449–4463, doi:10.1175/2008JCLI2183.1.
