Simple Way to Compute Time Series Rolling Average/Trend Prediction over Time with the Laplace Transform

Introduction

In marketing and many other business and industry scenarios, data scientists and analysts need to generate a rolling average or predict the trend of a time series. For example, we may be asked to estimate the average of sales at the "current" time point, where the model needs to "forget" the old history. Here I would like to introduce a very interesting way to compute a moving average that can "forget" old data by using the concept of the Laplace transform. Don't be terrified: the calculation is very simple. It also reduces the state we keep to a single number when we regularly need to roll the average over a time span forward to the next time point. This approach also helps explain the meaning of the Laplace transform variable s as the period T. I have applied this method in a few of my projects.

The Formula

To make it simple:

[Formula image — in plain text: average at time t0 ≈ T * Σ_i x[i] * exp(-T * (t0 - t[i])), summing over the data points up to t0.]

In the formula, x[i] is the data value, t[i] is its time point, and T is the period you want to calculate the average over. We can also understand the formula as a sum of values that decay exponentially over time (i.e. after a time of 1/T, the original value x[i] decays to x[i]/e); interestingly, this sum multiplied by T is the approximate moving average (I will demonstrate this later with the Laplace transform).
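As a quick illustration, the formula is a one-liner even for unevenly spaced time stamps. This is a minimal sketch; the helper name and toy data below are mine, not from the article.

import numpy as np

def laplace_rolling_average(x, t, t0, T):
    # T times the sum of the values, each decayed by exp(-T * age),
    # where age = t0 - t[i] is how long ago the sample arrived
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    return T * np.sum(x * np.exp(-T * (t0 - t)))

# example: unit samples once per day for 50 days, evaluated "today" at t0 = 49
x = np.ones(50)
t = np.arange(50)
print(laplace_rolling_average(x, t, t0=49, T=1.0))   # about 1.58, as in the code example below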

The Good Feature - Simple Rolling Over Computation

We can compute the next average by updating the previous value, without re-querying the values inside the time window of T. See below:

[Formula image — in plain text: average at time t0 + W ≈ (average at t0) * exp(-T * W) + T * Σ x[i] * exp(-T * (t0 + W - t[i])), where the sum runs over the data points that arrived in (t0, t0 + W].]

Here W is the length of the time span we roll forward by; W does not need to be equal to T. It is a lot easier to understand this in terms of a "sum of values decaying over time": we simply decay the current average by exp(-T*W) and then add the values that arrived in the current time span of W, as in the sketch below.
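Here is a minimal sketch of the roll-over update; the helper names and the random test data are mine, but the logic follows the recurrence above.

import numpy as np

def direct_average(x, t, t0, T):
    # evaluate the formula from scratch at time t0
    x, t = np.asarray(x, dtype=float), np.asarray(t, dtype=float)
    return T * np.sum(x * np.exp(-T * (t0 - t)))

def roll_forward(prev_avg, x_new, t_new, t0_prev, W, T):
    # decay the previous average over the elapsed span W, then add the
    # contribution of the samples that arrived in (t0_prev, t0_prev + W]
    t0_new = t0_prev + W
    x_new, t_new = np.asarray(x_new, dtype=float), np.asarray(t_new, dtype=float)
    return prev_avg * np.exp(-T * W) + T * np.sum(x_new * np.exp(-T * (t0_new - t_new)))

# check that rolling forward matches recomputing from the full history
rng = np.random.default_rng(0)
t_old, x_old = np.sort(rng.uniform(0, 10, 50)), rng.uniform(0, 2, 50)
t_new, x_new = np.sort(rng.uniform(10, 12, 10)), rng.uniform(0, 2, 10)
T, W = 0.5, 2.0
a_prev = direct_average(x_old, t_old, t0=10.0, T=T)
a_roll = roll_forward(a_prev, x_new, t_new, t0_prev=10.0, W=W, T=T)
a_full = direct_average(np.concatenate([x_old, x_new]), np.concatenate([t_old, t_new]), t0=12.0, T=T)
print(a_roll, a_full)   # the two values agree to floating-point precision

In production this means keeping only one number per series (plus the decay parameter) instead of the raw history for the whole window.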

Code Example

To make this easier to understand, let's try a simple piece of Python code.

import numpy as np

# 50,000 data points with value 1, one point per time unit (e.g. per day)
k = 50000
arr = np.ones(k)

# candidate values of the Laplace variable s
s1 = 1
s2 = 3
s3 = 6
s4 = 24

# s times the sum of values, each decayed by exp(-s * age), with ages 0, 1, 2, ... days
at1 = s1 * np.sum(arr * np.exp(-np.arange(0, k) * s1))
at2 = s2 * np.sum(arr * np.exp(-np.arange(0, k) * s2))
at3 = s3 * np.sum(arr * np.exp(-np.arange(0, k) * s3))
at4 = s4 * np.sum(arr * np.exp(-np.arange(0, k) * s4))

print('average over 1 day:', at1)
print('average over 3 day:', at2)
print('average over 6 day:', at3)
print('average over 24 day:', at4)

# average over 1 day: 1.5819767068693265
# average over 3 day: 3.1571870894737675
# average over 6 day: 6.014909469941067
# average over 24 day: 24.000000000906034

As you can see, this does not work well when s is larger (especially for s4 = 24, where the result is 24 rather than the true average of 1 for a constant series of ones). What's the reason? The data we feed into the model are discrete points, but we are trying to compute a smooth average over time (take this as Question A; I will explain it later with the Laplace transform). But let's try another piece of code, and you will find that it actually works very well:

# same total amount per day, but sampled 100 times more finely:
# each point has value 1/100 and the points are spaced 1/100 of a day apart
k = 50000
arr = np.ones(k)
s1 = 1
at1 = s1 * np.sum(arr/100 * np.exp(-np.arange(0, k)/100*s1))
print('average over 1 day:', at1)
# average over 1 day: 1.0050083333194446

Why? We need to ask ourselves what a data "POINT" actually is. See the figure below: when a value x[i] is given at a specific time "point" t[i], mathematically we can treat it as a Dirac delta function/distribution (a distribution that goes to positive infinity as t -> t[i], is 0 wherever t != t[i], and has an integral of exactly x[i]). When we compute the average of such points, we can imagine using two unit step functions to build a flat platform that represents that average. In the figure below, we have 3 Dirac impulses of area 1, and we use two unit step functions, u(t-1) - u(t-3), to represent the flat platform of height 1 and width 3.

[Figure: three Dirac impulses of area 1 and the flat platform, built from two unit step functions, that approximates their average.]

The Math - Laplace Transform of Dirac Impulses and Unit Step Functions

[Images: the Laplace transforms of the Dirac impulses and of the unit-step platform used in the derivation.]
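Since the derivation images are not available here, the following is my reconstruction of the standard transforms involved and of one way to read the argument (matching the two transforms at the working value of s is my paraphrase, not a quote of the original derivation):

\mathcal{L}\{\delta(t - a)\}(s) = e^{-as}, \qquad \mathcal{L}\{u(t - a)\}(s) = \frac{e^{-as}}{s}

% the data, seen from the current time t_0, is a train of impulses with ages a_i = t_0 - t_i
\mathcal{L}\Big\{\sum_i x_i\,\delta(t - a_i)\Big\}(s) = \sum_i x_i\, e^{-a_i s}

% the ideal moving average is a flat platform of height A and width W
\mathcal{L}\big\{A\,[\,u(t) - u(t - W)\,]\big\}(s) = A\,\frac{1 - e^{-sW}}{s}

% matching the two transforms at the working value s = T, with W large enough that e^{-TW} \approx 0:
A \approx T \sum_i x_i\, e^{-T a_i}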

Now one interesting thing is that we have reached a real physical interpretation of the Laplace transform variable "s". What does "s" mean? In the Fourier transform, we know the variable is the frequency. However, for the Laplace transform, most textbooks do not clearly explain the physical meaning of the variable "s". Here we can see that "s" actually corresponds to the time period: if s is set to T, you get the average of x[i] over the "period" of T.

That's how we can reach the formula with simplicity.

Explanation of Question A:

Question A arises when T is very small, i.e. comparable to the time gap between adjacent data points. In the first code example, when s1 = 1, why is the output 1.582 significantly greater than the "real average" of 1? Because there is a Dirac impulse exactly at time point 0, and the closest one to it is at time point -1. In the "decaying" model, an area (of a Dirac impulse) concentrated entirely at time point 0 contributes more than the same area spread evenly between -1 and 0 (i.e. a 1 x 1 square). That is why the value from this algorithm is 1.582; in fact, it truly reflects the effect of "decaying over time". When the averaging period is larger, this effect is spread over a larger time window and the result becomes more immune to local changes.
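As a quick check of these numbers (my own arithmetic): for the constant series of ones with unit spacing, the weighted sum is a geometric series, so the first code example computes

s \sum_{n=0}^{\infty} e^{-ns} = \frac{s}{1 - e^{-s}}, \qquad
\frac{1}{1 - e^{-1}} \approx 1.5820, \qquad
\frac{24}{1 - e^{-24}} \approx 24.0

% whereas the idealized continuous-time version gives exactly 1:
s \int_0^{\infty} e^{-st}\, dt = 1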

This method does a very good job of computing the average over time and predicting the trend. Check out the code here for an example. The blue and green dashes almost overlap, where blue is the result of the current Laplace method and green is the plain moving average (the sum of the values in the period divided by the period).

[Figure: the Laplace-method rolling average (blue) and the plain moving average (green), which almost overlap.]
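The linked code is not reproduced here; the snippet below is a rough sketch of the kind of comparison described. The data, the parameters, and the choice to set the decay scale 1/T to roughly half the boxcar window are my assumptions, not necessarily those of the original figure.

import numpy as np

# made-up daily series: a slow trend plus noise
rng = np.random.default_rng(7)
t = np.arange(0.0, 200.0)
x = 0.05 * t + rng.normal(0.0, 0.5, t.size)

W = 28          # window of the plain moving average, in days
T = 2.0 / W     # decay rate chosen so both averages lag by roughly W / 2 days

laplace_avg, moving_avg = [], []
for i in range(W, t.size):
    age = t[i] - t[: i + 1]
    laplace_avg.append(T * np.sum(x[: i + 1] * np.exp(-T * age)))   # Laplace-style average
    moving_avg.append(np.mean(x[i - W + 1 : i + 1]))                # plain moving average

print(np.round(laplace_avg[-3:], 2))
print(np.round(moving_avg[-3:], 2))

With these parameter choices the two curves should track each other closely, similar to the overlap described above.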

Limitations

However, there is a limitation if you compare it to the real moving average when the values contain a sine wave component. This algorithm seems to amplify the sine wave, relative to the real moving average, once T exceeds about half of the period of the sine wave.

[Figure: comparison on a series with a sine-wave component, where the Laplace-style average shows a larger oscillation than the true moving average.]
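One way to understand this behavior (my own reasoning, not part of the original article) is to compare the frequency response of the exponential weighting with that of the true boxcar moving average. For a sine component with angular frequency \omega:

% gain of the exponential kernel  T e^{-Tt}, t \ge 0:
\Big| \int_0^{\infty} T e^{-Tt} e^{-i\omega t}\, dt \Big| = \frac{T}{\sqrt{T^2 + \omega^2}}

% gain of the boxcar average of width W:
\Big| \frac{1}{W} \int_0^{W} e^{-i\omega t}\, dt \Big| = \Big| \frac{\sin(\omega W / 2)}{\omega W / 2} \Big|

The boxcar gain has exact nulls whenever the window spans a whole number of sine periods and falls off quickly, while the exponential gain decays smoothly and never reaches zero. So for windows that are long relative to the sine period, the exponentially weighted average retains noticeably more of the oscillation than the true moving average, which shows up as apparent amplification in the comparison.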

