All About Time Series Analysis and Forecasting

What is time series analysis?

Time series analysis is a specific way of analyzing a sequence of data points collected over an interval of time. In time series analysis, analysts record data points at consistent intervals over a set period of time rather than just recording the data points intermittently or randomly. What sets time series data apart from other data is that the analysis can show how variables change over time. Time series analysis typically requires a large number of data points to ensure consistency and reliability. An extensive data set ensures you have a representative sample size and that analysis can cut through noisy data. It also ensures that any trends or patterns discovered are not outliers and can account for seasonal variance. Additionally, time series data can be used for forecasting—predicting future data based on historical data.

When time series analysis is used and when it isn’t

Time series analysis is used for non-stationary data—things that are constantly fluctuating over time or are affected by time. Industries like finance, retail, and economics frequently use time series analysis because currency and sales are always changing. Stock market analysis is an excellent example of time series analysis in action, especially with automated trading algorithms. Likewise, time series analysis is ideal for forecasting weather changes, helping meteorologists predict everything from tomorrow’s weather report to future years of climate change. Examples of time series analysis in action include:

  • Weather data
  • Rainfall measurements
  • Temperature readings
  • Heart rate monitoring (EKG)
  • Brain monitoring (EEG)
  • Quarterly sales
  • Stock prices
  • Automated stock trading
  • Industry forecasts
  • Interest rates

Classification and considerations

While time series data is data collected over time, there are different types of data that describe how and when that time data was recorded. For example:

  • Time series data is data that is recorded over consistent intervals of time.
  • Cross-sectional data consists of several variables recorded at the same time.
  • Pooled data is a combination of both time series data and cross-sectional data.

Further, time series data can be classified into two main categories:

  • Stock time series data means measuring attributes at a certain point in time, like a static snapshot of the information as it was.
  • Flow time series data means measuring the activity of the attributes over a certain period, which is generally part of the total whole and makes up a portion of the results.
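One practical consequence of the stock/flow distinction shows up when data is aggregated to a coarser interval: flows are added up over the period, while stocks are read off at a single point in it. The sketch below illustrates this under the assumption that the end-of-period value is the snapshot of interest; the function name `aggregate` and the sample figures are hypothetical.

```python
def aggregate(values, period, kind):
    """Aggregate a series into blocks of `period` observations.

    kind="flow":  activity over the period, so the block is summed.
    kind="stock": a snapshot, so the last value in the block is kept
                  (assumption: end-of-period snapshots are wanted).
    """
    if kind not in ("stock", "flow"):
        raise ValueError("kind must be 'stock' or 'flow'")
    if period < 1 or len(values) % period != 0:
        raise ValueError("series length must be a multiple of period")
    chunks = [values[i:i + period] for i in range(0, len(values), period)]
    if kind == "flow":
        return [sum(chunk) for chunk in chunks]
    return [chunk[-1] for chunk in chunks]


# Hypothetical monthly figures for two quarters.
monthly_sales = [10, 12, 11, 14, 13, 15]        # flow: units sold per month
monthly_inventory = [40, 38, 41, 39, 42, 44]    # stock: units on hand at month end

print(aggregate(monthly_sales, 3, "flow"))
print(aggregate(monthly_inventory, 3, "stock"))
```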

In time series data, variations can occur sporadically throughout the data:

  • Functional analysis can pick out the patterns and relationships within the data to identify notable events.
  • Trend analysis means determining consistent movement in a certain direction. There are two types of trends: deterministic, where we can find the underlying cause, and stochastic, which is random and unexplainable.
  • Seasonal variation describes events that occur at specific and regular intervals during the course of a year.
  • Serial dependence occurs when data points close together in time tend to be related.
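Serial dependence can be measured with the sample autocorrelation, which compares each observation with the one a fixed number of steps earlier. The following is a minimal sketch in plain Python; the function name `autocorrelation` and the temperature readings are illustrative assumptions, not part of the original article.

```python
def autocorrelation(series, lag=1):
    """Sample autocorrelation of `series` at the given lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    # Total variation around the mean (the denominator).
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant, autocorrelation is undefined")
    # Co-variation between each point and the point `lag` steps earlier.
    num = sum((series[t] - mean) * (series[t - lag] - mean)
              for t in range(lag, n))
    return num / denom


# Hypothetical daily temperatures: neighbouring days are similar.
temps = [12.0, 12.5, 13.1, 13.0, 13.8, 14.2, 14.0, 14.9, 15.3, 15.1]
print(round(autocorrelation(temps, lag=1), 3))
```

A value close to 1 means neighbouring points move together, a value near 0 suggests little serial dependence, and a negative value means the series tends to alternate.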

Components for Time Series Analysis

The forces that affect the values of an observation in a time series are called the components of the time series. There are four categories of components:

  • Trend
  • Seasonal Variations
  • Cyclic Variations
  • Random or Irregular movements

Seasonal and Cyclic Variations are the periodic changes or short-term fluctuations.

1. Trends

A trend is a movement toward relatively higher or lower values over a long period. When a series shows a general upward pattern, we call it an uptrend; when it shows a general downward pattern, we call it a downtrend.

Linear and non-linear trend: If we plot the time series values against time t, the way the points cluster shows the type of trend. If the points cluster more or less around a straight line, the trend is linear; otherwise it is non-linear (curvilinear).

The following graph depicts a series in which there is an obvious upward trend over time:

[Figure: a series with an obvious upward trend over time]
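A linear trend can be estimated by fitting a straight line to the values against their time index with ordinary least squares. The sketch below is one simple way to do that; the function names and the sample values are assumptions made for illustration.

```python
def fit_linear_trend(values):
    """Least-squares line through the points (t, values[t]), t = 0, 1, 2, ...

    Returns (slope, intercept).
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    # Covariance of time and value, and variance of time (both unnormalised).
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    return slope, intercept


def trend_direction(values):
    """Label the fitted slope as an uptrend, a downtrend, or flat."""
    slope, _ = fit_linear_trend(values)
    if slope > 0:
        return "uptrend"
    if slope < 0:
        return "downtrend"
    return "flat"


# Hypothetical yearly values that drift upward with some noise.
yearly = [102, 105, 103, 110, 112, 111, 118, 121]
slope, intercept = fit_linear_trend(yearly)
print(trend_direction(yearly), round(slope, 2))
```

If the points sit far from the fitted line in a systematic way, for example curving away at both ends, that is a sign the trend is non-linear.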

2. Seasonal Variations

Seasonal variation or Seasonality is a repeating pattern within a fixed period. Seasonality in a time series can be identified by regularly spaced peaks and troughs which have a consistent direction and approximately the same magnitude every year, relative to the trend. The following diagram depicts a strongly seasonal series. There is an obvious large seasonal increase in December retail sales in New South Wales due to Christmas shopping. In this example, the magnitude of the seasonal component increases over time, as does the trend.

[Figure: retail sales in New South Wales, with a large seasonal peak each December]

3. Cyclic Variations

Cyclic variation is somewhat like seasonality, but the duration of a cycle is not fixed, and the time between two cycles can be much longer.

4. Random or Irregular Movements

The irregular component (sometimes also known as the residual) is what remains after the seasonal and trend components of a time series have been estimated and removed. It results from short term fluctuations in the series which are neither systematic nor predictable. In a highly irregular series, these fluctuations can dominate movements, which will mask the trend and seasonality. The following graph is of a highly irregular time series:

[Figure: a highly irregular time series]

Types of time series analysis

Even within time series analysis, there are different types and models of analysis that will achieve different results.

  • Classification: Identifies and assigns categories to the data.
  • Curve fitting: Plots the data along a curve to study the relationships of variables within the data.
  • Descriptive analysis: Identifies patterns in time series data, like trends, cycles, or seasonal variation.
  • Explanative analysis: Attempts to understand the data and the relationships within it, as well as cause and effect.
  • Exploratory analysis: Highlights the main characteristics of the time series data, usually in a visual format.
  • Forecasting: Predicts future data. This type is based on historical trends. It uses the historical data as a model for future data, predicting scenarios that could happen along future plot points.
  • Intervention analysis: Studies how an event can change the data.
  • Segmentation: Splits the data into segments to show the underlying properties of the source information.

Modelling time series

There are many ways to model a time series in order to make predictions.

  • moving average
  • exponential smoothing
  • ARIMA

A. Moving Average: The moving average model is probably the most naive approach to time series modelling. In its simplest form, it states that the next observation is the mean of all past observations. Although simple, this model can be surprisingly good, and it represents a good starting point. The moving average can also be used to identify interesting trends in the data: we define a window over which to average, which smooths the time series and highlights different trends.

[Figure: the time series with its moving average over a 24h window]

In the plot above, we applied the moving average model to a 24h window. The green line smooths the time series, and we can see that there are 2 peaks in a 24h period.

Of course, the longer the window, the smoother the trend will be. Below is an example of a moving average on a smaller, 12h window.

[Figure: the time series with its moving average over a 12h window]
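The windowed moving average described above can be sketched in a few lines of plain Python. This version uses a trailing window (each output is the mean of the most recent `window` observations), which is an assumption, since the plots do not say how the window is aligned; the function names and readings are illustrative.

```python
def moving_average(series, window):
    """Trailing moving average.

    Output element i is the mean of series[i : i + window], so the result
    has len(series) - window + 1 elements.
    """
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    running = sum(series[:window])
    smoothed = [running / window]
    for t in range(window, len(series)):
        # Slide the window: add the newest value, drop the oldest.
        running += series[t] - series[t - window]
        smoothed.append(running / window)
    return smoothed


def forecast_next(series, window):
    """Naive forecast: the next value is the mean of the last `window` values."""
    return moving_average(series, window)[-1]


# Hypothetical hourly readings.
hourly = [5, 7, 6, 9, 12, 10, 8, 7, 9, 11, 14, 12]
print(moving_average(hourly, 3))
print(moving_average(hourly, 6))   # longer window, smoother output
print(forecast_next(hourly, 3))
```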

B. Exponential smoothing

Exponential smoothing uses similar logic to the moving average, but this time a different, decreasing weight is assigned to each observation. In other words, less importance is given to observations as we move further from the present. Mathematically, in its standard form, exponential smoothing is expressed as:

$$y_t = \alpha x_t + (1 - \alpha)\, y_{t-1}, \quad t > 0$$

Here, $x_t$ is the observation at time t, $y_t$ is the smoothed value, and alpha ($\alpha$) is a smoothing factor that takes values between 0 and 1. It determines how fast the weight decreases for previous observations.

[Figure: exponential smoothing of the time series with smoothing factors of 0.3 and 0.05]

From the plot above, the dark blue line represents the exponential smoothing of the time series using a smoothing factor of 0.3, while the orange line uses a smoothing factor of 0.05.

As you can see, the smaller the smoothing factor, the smoother the time series will be. This makes sense, because as the smoothing factor approaches 0, the weights on past observations decay more slowly, so each smoothed value behaves more like an average over a long stretch of history.
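The effect of the smoothing factor can be checked with a small sketch of simple exponential smoothing, where each smoothed value is alpha times the new observation plus (1 - alpha) times the previous smoothed value. Starting the recursion from the first observation is an assumption (other initialisations exist), and the `roughness` helper and the readings are hypothetical.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing, seeded with the first observation."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    if not series:
        raise ValueError("series is empty")
    smoothed = [series[0]]
    for x in series[1:]:
        # New value = alpha * observation + (1 - alpha) * previous smoothed value.
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed


def roughness(values):
    """Sum of absolute step-to-step changes; smaller means smoother."""
    return sum(abs(b - a) for a, b in zip(values, values[1:]))


readings = [20, 23, 19, 25, 22, 27, 21, 26, 24, 29]
fast = exponential_smoothing(readings, alpha=0.3)
slow = exponential_smoothing(readings, alpha=0.05)

# The smaller smoothing factor should give the smoother (less rough) line.
print(roughness(readings), roughness(fast), roughness(slow))
```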

Double exponential smoothing

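Double exponential smoothing is used when there is a trend in the time series. In that case, we apply exponential smoothing recursively, twice: once to the level of the series and once to its trend. In a common formulation:

$$y_t = \alpha x_t + (1 - \alpha)(y_{t-1} + b_{t-1})$$
$$b_t = \beta (y_t - y_{t-1}) + (1 - \beta)\, b_{t-1}$$

Here, $x_t$ is the observation, $y_t$ is the smoothed level, $b_t$ is the trend estimate, alpha ($\alpha$) is the smoothing factor for the level, and beta ($\beta$) is the trend smoothing factor, which takes values between 0 and 1.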

Below, you can see how different values of alpha and beta affect the shape of the time series.

[Figure: double exponential smoothing for different values of alpha and beta]
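A minimal sketch of double exponential smoothing follows, keeping a running level and a running trend and updating each with its own smoothing factor. The initialisation (level from the first observation, trend from the first difference) and the `horizon` parameter for forecasting are assumptions chosen for simplicity.

```python
def double_exponential_smoothing(series, alpha, beta, horizon=0):
    """Double (Holt-style) exponential smoothing.

    Returns (smoothed, forecasts), where forecasts extends the final level
    along the final trend for `horizon` steps.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    if not (0 < alpha <= 1 and 0 < beta <= 1):
        raise ValueError("alpha and beta must be in (0, 1]")
    level = series[0]
    trend = series[1] - series[0]   # assumed initial trend
    smoothed = [series[0]]
    for x in series[1:]:
        prev_level = level
        # Level: blend the observation with the previous level plus trend.
        level = alpha * x + (1 - alpha) * (prev_level + trend)
        # Trend: blend the latest change in level with the previous trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
        smoothed.append(level)
    forecasts = [level + h * trend for h in range(1, horizon + 1)]
    return smoothed, forecasts


# Hypothetical series with a steady upward trend plus noise.
sales = [30, 33, 35, 40, 41, 45, 48, 50]
smoothed, forecasts = double_exponential_smoothing(sales, alpha=0.5, beta=0.3, horizon=3)
print([round(v, 2) for v in smoothed])
print([round(v, 2) for v in forecasts])
```

Unlike simple exponential smoothing, whose forecast is flat, the forecasts here continue along the most recent trend estimate.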

Triple exponential smoothing

This method extends double exponential smoothing by adding a seasonal smoothing factor. Of course, this is useful if you notice seasonality in your time series.

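Mathematically, in a common multiplicative formulation, triple exponential smoothing is expressed as:

$$y_t = \alpha \frac{x_t}{c_{t-L}} + (1 - \alpha)(y_{t-1} + b_{t-1})$$
$$b_t = \beta (y_t - y_{t-1}) + (1 - \beta)\, b_{t-1}$$
$$c_t = \gamma \frac{x_t}{y_t} + (1 - \gamma)\, c_{t-L}$$

Where $x_t$ is the observation, $y_t$ is the smoothed level, $b_t$ is the trend, $c_t$ is the seasonal component, gamma ($\gamma$) is the seasonal smoothing factor, and L is the length of the season.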
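The sketch below shows one way triple exponential smoothing can be implemented, using a multiplicative seasonal component (the observation is divided by the seasonal index from one season earlier). The initialisation from the first two seasons is an assumption, one of several in use, and the function and parameter names are illustrative.

```python
def triple_exponential_smoothing(series, season_length, alpha, beta, gamma, horizon=0):
    """Triple exponential smoothing with multiplicative seasonality.

    Returns (fitted, forecasts): one-step-ahead fitted values for the
    observed period and `horizon` forecasts beyond it.
    Assumes strictly positive data and at least two full seasons.
    """
    L = season_length
    if L < 1 or len(series) < 2 * L:
        raise ValueError("need at least two full seasons of data")
    if min(series) <= 0:
        raise ValueError("multiplicative seasonality needs positive values")
    # Assumed initialisation: level = mean of the first season, trend = average
    # per-step change between the first two seasons, seasonal index = ratio
    # of each first-season value to the initial level.
    level = sum(series[:L]) / L
    trend = sum((series[L + i] - series[i]) / L for i in range(L)) / L
    seasonal = [series[i] / level for i in range(L)]

    fitted = []
    for t, x in enumerate(series):
        c_prev = seasonal[t % L]                 # index from one season ago
        fitted.append((level + trend) * c_prev)  # one-step-ahead prediction
        prev_level = level
        level = alpha * (x / c_prev) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonal[t % L] = gamma * (x / level) + (1 - gamma) * c_prev

    n = len(series)
    forecasts = [(level + h * trend) * seasonal[(n + h - 1) % L]
                 for h in range(1, horizon + 1)]
    return fitted, forecasts


# Hypothetical quarterly data: rising level with a peak in the fourth quarter.
quarterly = [20, 22, 21, 30, 24, 26, 25, 36, 28, 31, 29, 42]
fitted, forecasts = triple_exponential_smoothing(
    quarterly, season_length=4, alpha=0.4, beta=0.2, gamma=0.3, horizon=4)
print([round(v, 1) for v in forecasts])
```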

C. ARIMA models

ARIMA (autoregressive integrated moving average) models are univariate: they are used to better understand a single time-dependent variable, such as temperature over time, and to predict its future data points. These models work on the assumption that the data is stationary, so analysts have to account for and remove as much trend and seasonality from past data points as they can, typically by differencing. Thankfully, the ARIMA family includes terms for this within the model: autoregressive terms, moving average terms, and difference operators, with seasonal difference operators in the seasonal variants.
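To make the differencing and autoregressive ideas concrete, here is a deliberately simplified sketch: it differences the series once, fits a single autoregressive term by ordinary least squares, and then undoes the differencing to forecast. This corresponds roughly to an ARIMA(1,1,0) model; it has no moving average term and does not use the maximum likelihood estimation that statistical libraries typically apply, so treat it as an illustration only. All function names are assumptions.

```python
def difference(series, lag=1):
    """Differencing: lag=1 removes a linear trend, lag=season length
    removes a stable seasonal pattern."""
    if not 0 < lag < len(series):
        raise ValueError("lag must be between 1 and len(series) - 1")
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def fit_ar1(series):
    """Least-squares estimate of (c, phi) in x_t = c + phi * x_{t-1} + noise."""
    if len(series) < 3:
        raise ValueError("need at least three observations")
    prev, nxt = series[:-1], series[1:]
    n = len(prev)
    mean_prev, mean_next = sum(prev) / n, sum(nxt) / n
    sxx = sum((p - mean_prev) ** 2 for p in prev)
    if sxx == 0:
        raise ValueError("series is constant, phi cannot be estimated")
    sxy = sum((p - mean_prev) * (q - mean_next) for p, q in zip(prev, nxt))
    phi = sxy / sxx
    c = mean_next - phi * mean_prev
    return c, phi


def forecast_next(series):
    """One-step forecast: difference once, apply AR(1), undo the differencing."""
    diffs = difference(series, lag=1)
    c, phi = fit_ar1(diffs)
    next_diff = c + phi * diffs[-1]
    return series[-1] + next_diff


# Hypothetical trending series.
temperature = [14.1, 14.6, 15.4, 15.9, 16.8, 17.2, 18.1, 18.5, 19.4]
print(difference(temperature)[:3])
print(round(forecast_next(temperature), 2))
```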

What are the underlying models used to decompose the observed time series?

Decomposition models are typically additive or multiplicative, but can also take other forms such as pseudo-additive.

Additive Decomposition

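In the additive model, the observed series is the sum of its components:

$$O_t = T_t + S_t + I_t$$

where $O_t$ is the observed series, $T_t$ is the trend, $S_t$ is the seasonal component, and $I_t$ is the irregular component.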

The following figure depicts a typically additive series. The underlying level of the series fluctuates, but the magnitude of the seasonal spikes remains approximately stable.

[Figure: a typically additive series]
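An additive decomposition can be sketched with the classical approach: estimate the trend with a centred moving average, average the detrended values by position in the season to get the seasonal component, and treat what is left as the irregular component. This is a simplified illustration, not the procedure any statistical agency uses in production; the function names and sample data are assumptions.

```python
def centered_moving_average(series, period):
    """Centred moving average as a trend estimate (None where it cannot be computed).

    For an even period the two end points of the window get half weight,
    so the average stays centred on an observation.
    """
    n, half = len(series), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    return trend


def additive_decompose(series, period):
    """Split a series into trend, seasonal and irregular parts that add up."""
    if period < 2 or len(series) < 2 * period:
        raise ValueError("need period >= 2 and at least two full periods")
    trend = centered_moving_average(series, period)
    # Average the detrended values for each position in the season.
    buckets = [[] for _ in range(period)]
    for t, (x, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(x - tr)
    means = [sum(b) / len(b) for b in buckets]
    offset = sum(means) / period            # make seasonal effects sum to zero
    seasonal = [means[t % period] - offset for t in range(len(series))]
    irregular = [None if tr is None else x - tr - s
                 for x, tr, s in zip(series, trend, seasonal)]
    return trend, seasonal, irregular


# Hypothetical quarterly series: rising level, seasonal swings of stable size.
quarterly = [13, 10, 10, 13, 17, 14, 14, 17, 21, 18, 18, 21]
trend, seasonal, irregular = additive_decompose(quarterly, period=4)
print(seasonal[:4])
```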

Multiplicative Decomposition

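In the multiplicative model, the observed series is the product of its components:

$$O_t = T_t \times S_t \times I_t$$

where $O_t$ is the observed series, $T_t$ is the trend, $S_t$ is the seasonal component, and $I_t$ is the irregular component.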

Most of the series analysed by the Australian Bureau of Statistics (ABS) show characteristics of a multiplicative model. As the underlying level of the series changes, the magnitude of the seasonal fluctuations varies as well.

[Figure: a typically multiplicative series]
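For a multiplicative series, the seasonal component is a ratio rather than an offset. A simple way to estimate it, sketched below, is the ratio-to-moving-average approach: divide each observation by a centred moving average and average those ratios by position in the season. This is an illustrative simplification under the assumption of strictly positive data; the function names and figures are hypothetical.

```python
def centered_moving_average(series, period):
    """Centred moving average (None at the ends where the window does not fit)."""
    n, half = len(series), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        total = sum(window)
        if period % 2 == 0:
            # Half weight on the two end points keeps the window centred.
            total -= 0.5 * (window[0] + window[-1])
        trend[t] = total / period
    return trend


def seasonal_indices(series, period):
    """Multiplicative seasonal indices, normalised to average exactly 1."""
    if period < 2 or len(series) < 2 * period:
        raise ValueError("need period >= 2 and at least two full periods")
    if min(series) <= 0:
        raise ValueError("multiplicative model needs positive values")
    trend = centered_moving_average(series, period)
    buckets = [[] for _ in range(period)]
    for t, (x, tr) in enumerate(zip(series, trend)):
        if tr is not None:
            buckets[t % period].append(x / tr)   # ratio to moving average
    means = [sum(b) / len(b) for b in buckets]
    scale = sum(means) / period
    return [m / scale for m in means]


def seasonally_adjust(series, period):
    """Divide each observation by the index for its position in the season."""
    idx = seasonal_indices(series, period)
    return [x / idx[t % period] for t, x in enumerate(series)]


# Hypothetical quarterly series whose seasonal swings grow with the level.
quarterly = [110, 90, 95, 130, 132, 108, 114, 156, 158, 130, 137, 187]
print([round(i, 3) for i in seasonal_indices(quarterly, 4)])
```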

Pseudo-Additive Decomposition

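The pseudo-additive model combines elements of the additive and multiplicative models:

$$O_t = T_t + T_t (S_t - 1) + T_t (I_t - 1) = T_t (S_t + I_t - 1)$$

where $O_t$ is the observed series, $T_t$ is the trend, $S_t$ is the seasonal component, and $I_t$ is the irregular component.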

An example of a series that requires a pseudo-additive decomposition model is shown below. This model is used because cereal crops are only produced during certain months, with crop production being virtually zero for one quarter each year.

[Figure: cereal crop production, with virtually zero production for one quarter each year]
