Comparing two univariate noisy time series of the same length and sampling time
Gustavo Sánchez Hurtado
Award-Winning Engineer, Researcher & Educator | Digital Transformation: Control Systems, IoT, and Machine Learning | PLC/SCADA programmer | Python/MATLAB | Node Red | Global Speaker, Author & Podcaster
A very common problem in practice is this: given two univariate noisy time series of the same length and sampling time, we need to decide whether they come from the same stochastic process. For example, consider the two series shown in the figure above. Each contains 100 values, from t = 0 to t = 10 s, so the sampling frequency is about 10 Hz. Both seem to have the same average, near 0, but series y2 seems to reach higher maximum and lower minimum values. Let's take a look at their basic statistics.
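As a minimal sketch of this step in Python (the author's own script is in R, linked at the end of the article), the summary statistics could be computed as follows. The data are a hypothetical stand-in, not the series from the figure: `make_series`, its sinusoid-plus-noise form, and the chosen amplitudes, frequencies and seeds are all assumptions, picked only to mimic two zero-mean series with different spreads.

```python
import math
import random
import statistics


def make_series(freq_hz, amplitude, seed, n=100, fs=10.0, noise_sd=0.5):
    """Hypothetical stand-in data: a sinusoid plus Gaussian noise (an assumption)."""
    rng = random.Random(seed)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * k / fs) + rng.gauss(0.0, noise_sd)
        for k in range(n)
    ]


def summarize(y):
    """Basic statistics of one series: mean, sample variance, min and max."""
    return {
        "mean": statistics.fmean(y),
        "variance": statistics.variance(y),
        "min": min(y),
        "max": max(y),
    }


# Two synthetic series of 100 samples at 10 Hz (assumed, not the article's data).
y1 = make_series(freq_hz=1.0, amplitude=1.0, seed=1)
y2 = make_series(freq_hz=2.0, amplitude=1.5, seed=2)

stats1 = summarize(y1)
stats2 = summarize(y2)

for name, stats in (("y1", stats1), ("y2", stats2)):
    row = "  ".join(f"{key}={value:7.3f}" for key, value in stats.items())
    print(f"{name}: {row}")
```

With these assumed parameters both means should land near 0 while y2 shows the larger variance, which is the pattern discussed next.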
We confirm that both series have a mean near 0, but y2 has a greater variance. Now let's take a look at their distributions.
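One hedged way to compare the two distributions numerically is to count both series on the same bin edges, so that the histograms are directly comparable. The sketch below is in Python and uses hypothetical stand-in data (`make_series` and all of its parameters are assumptions, not the series from the article); the number of bins is also an arbitrary choice.

```python
import math
import random


def make_series(freq_hz, amplitude, seed, n=100, fs=10.0, noise_sd=0.5):
    """Hypothetical stand-in data: a sinusoid plus Gaussian noise (an assumption)."""
    rng = random.Random(seed)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * k / fs) + rng.gauss(0.0, noise_sd)
        for k in range(n)
    ]


def histogram(y, lo, hi, n_bins):
    """Count the values of y in n_bins equal-width bins spanning [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for value in y:
        # Clamp so that value == hi falls into the last bin.
        index = min(int((value - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts


y1 = make_series(freq_hz=1.0, amplitude=1.0, seed=1)
y2 = make_series(freq_hz=2.0, amplitude=1.5, seed=2)

# Shared bin edges covering both series.
lo = min(min(y1), min(y2))
hi = max(max(y1), max(y2))
n_bins = 8
counts1 = histogram(y1, lo, hi, n_bins)
counts2 = histogram(y2, lo, hi, n_bins)

width = (hi - lo) / n_bins
for i in range(n_bins):
    left = lo + i * width
    print(f"[{left:6.2f}, {left + width:6.2f})  y1 {'#' * counts1[i]:<40} y2 {'#' * counts2[i]}")
```

A series with a broader distribution puts more of its counts in the outer bins.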
Again we can see that y2 seems to have a broader distribution than y1. Although their distributions do not look Gaussian, we may perform an F-test to compare the variances, bearing in mind that this test is sensitive to departures from normality.
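The F-test for equal variances can be sketched in plain Python as below. This is an illustration under stated assumptions, not the author's R code: the data come from a hypothetical `make_series` generator, and the F distribution is evaluated with a hand-written regularized incomplete beta function (`reg_inc_beta`, a helper introduced here) so that the block has no dependencies. In practice one would more likely rely on R's `var.test` or on `scipy.stats.f`. The test also assumes independent Gaussian samples, which time series rarely satisfy, so the p-value is indicative only.

```python
import math
import random
import statistics


def make_series(freq_hz, amplitude, seed, n=100, fs=10.0, noise_sd=0.5):
    """Hypothetical stand-in data: a sinusoid plus Gaussian noise (an assumption)."""
    rng = random.Random(seed)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * k / fs) + rng.gauss(0.0, noise_sd)
        for k in range(n)
    ]


def _beta_cf(a, b, x, max_iter=300, eps=1e-13):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log(1.0 - x))
    front = math.exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_cdf(f, d1, d2):
    """CDF of the F distribution with (d1, d2) degrees of freedom."""
    if f <= 0.0:
        return 0.0
    return reg_inc_beta(d1 / 2.0, d2 / 2.0, d1 * f / (d1 * f + d2))


def f_test(a, b):
    """Two-sided F-test for equal variances; returns (F statistic, p-value)."""
    f_stat = statistics.variance(a) / statistics.variance(b)
    cdf = f_cdf(f_stat, len(a) - 1, len(b) - 1)
    return f_stat, min(1.0, 2.0 * min(cdf, 1.0 - cdf))


y1 = make_series(freq_hz=1.0, amplitude=1.0, seed=1)
y2 = make_series(freq_hz=2.0, amplitude=1.5, seed=2)
f_stat, p_value = f_test(y1, y2)
print(f"F = {f_stat:.3f}, two-sided p-value = {p_value:.4f}")
```

A ratio far from 1 together with a small p-value suggests unequal variances.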
Given the p-values, we have further evidence that these two series do not come from the same stochastic process.
Let's take a look at their spectra.
We can see that the spectrum of y2 has a peak at a higher frequency and with a larger amplitude than that of y1.
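A spectral comparison of this kind can be sketched with a raw periodogram, computed here with a direct discrete Fourier transform in plain Python. Two assumptions apply: the data are a hypothetical stand-in (`make_series` with frequencies of 1 Hz and 2 Hz chosen for illustration), and the raw periodogram may differ from the estimator used in the author's R script, which could apply tapering or smoothing.

```python
import cmath
import math
import random


def make_series(freq_hz, amplitude, seed, n=100, fs=10.0, noise_sd=0.5):
    """Hypothetical stand-in data: a sinusoid plus Gaussian noise (an assumption)."""
    rng = random.Random(seed)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * k / fs) + rng.gauss(0.0, noise_sd)
        for k in range(n)
    ]


def periodogram(y, fs):
    """Raw periodogram of the mean-removed series at frequencies k * fs / n, k = 1..n//2."""
    n = len(y)
    mean = sum(y) / n
    centered = [value - mean for value in y]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        # Direct DFT coefficient at bin k (O(n^2) overall, fine for n = 100).
        coef = sum(
            value * cmath.exp(-2j * cmath.pi * k * i / n)
            for i, value in enumerate(centered)
        )
        freqs.append(k * fs / n)
        power.append(abs(coef) ** 2 / n)
    return freqs, power


def spectral_peak(y, fs):
    """Return (frequency, power) of the largest periodogram ordinate."""
    freqs, power = periodogram(y, fs)
    best = max(range(len(power)), key=power.__getitem__)
    return freqs[best], power[best]


fs = 10.0  # sampling frequency in Hz
y1 = make_series(freq_hz=1.0, amplitude=1.0, seed=1)
y2 = make_series(freq_hz=2.0, amplitude=1.5, seed=2)

peak_f1, peak_p1 = spectral_peak(y1, fs)
peak_f2, peak_p2 = spectral_peak(y2, fs)
print(f"y1 peak: {peak_f1:.1f} Hz, power {peak_p1:.2f}")
print(f"y2 peak: {peak_f2:.1f} Hz, power {peak_p2:.2f}")
```

Comparing the location and height of the dominant peaks gives a compact numerical counterpart to reading the two spectra by eye.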
The R script for this example is available at:
https://github.com/multiopti/MYWAI/blob/main/comparing2series.R
Do you know a better approach to solve this problem? Do you have a counterexample in which this method does not work? Do you have any general comment about this article?
I would be happy to receive your comments to: [email protected]
At MYWAI we promote agile, explainable, reliable and affordable ML at the edge.
Risk Analyst
2 years ago: Hi Prof... It depends on what you want to study from the two series. IMHO:
1. First of all, I presume both series are stationary. In that case you can find the optimum ARMA model for each series individually by using the auto.arima function in R. Or you can write a short program in R that iterates up to AR = 12 and MA = 12 to find the best combination, minimising an information criterion (either AIC or BIC). Then test the residuals for serial correlation. ARCH effects may be present, but that is not an issue (because volatility is not studied here). The residuals should be stationary. Once you have tested the robustness, you can forecast with these models.
2. Second, if they are not stationary but the first differences of both are stationary, you can test for cointegration, as both are integrated of order one, I(1). Essentially that means that there is a unit root, hence a cointegration test can be done.
3. If their first differences are stationary, you can look at Vector Autoregression (VAR) to model the two series and understand causality.
4. If one of the series is stationary whereas the other is not, you can use ARDL.
5. You can use multivariate GARCH models to study the interdependencies with respect to volatilities.
Hope the above helps.
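The order search described in point 1 of the comment above can be sketched in Python in a simplified form. Note the swap: this is a pure AR fit via the Yule-Walker equations (Levinson-Durbin recursion) with the order chosen by AIC, not the full ARMA search and not R's auto.arima. The simulated AR(2) series, its coefficients (0.6 and -0.3), its length and the maximum order of 12 are assumptions for illustration only.

```python
import math
import random


def simulate_ar2(phi1, phi2, n, seed, burn_in=200):
    """Hypothetical test data: y[t] = phi1*y[t-1] + phi2*y[t-2] + Gaussian noise."""
    rng = random.Random(seed)
    y = [0.0, 0.0]
    for _ in range(n + burn_in):
        y.append(phi1 * y[-1] + phi2 * y[-2] + rng.gauss(0.0, 1.0))
    return y[-n:]


def autocovariance(y, max_lag):
    """Biased sample autocovariances for lags 0..max_lag."""
    n = len(y)
    mean = sum(y) / n
    c = [value - mean for value in y]
    return [sum(c[i] * c[i + lag] for i in range(n - lag)) / n for lag in range(max_lag + 1)]


def levinson_durbin(acov, max_order):
    """Yule-Walker AR fits for orders 0..max_order as (coefficients, noise variance)."""
    phi = []
    sigma2 = acov[0]
    fits = [([], sigma2)]
    for p in range(1, max_order + 1):
        # Reflection coefficient (partial autocorrelation at lag p).
        k = (acov[p] - sum(phi[j] * acov[p - 1 - j] for j in range(p - 1))) / sigma2
        phi = [phi[j] - k * phi[p - 2 - j] for j in range(p - 1)] + [k]
        sigma2 *= 1.0 - k * k
        fits.append((phi, sigma2))
    return fits


def select_ar_order(y, max_order=12):
    """Return (best order, AIC per order, fits), with AIC = n*log(sigma2) + 2*p."""
    fits = levinson_durbin(autocovariance(y, max_order), max_order)
    n = len(y)
    aic = [n * math.log(sigma2) + 2 * p for p, (_, sigma2) in enumerate(fits)]
    best = min(range(len(aic)), key=aic.__getitem__)
    return best, aic, fits


y = simulate_ar2(phi1=0.6, phi2=-0.3, n=2000, seed=7)
best_order, aic, fits = select_ar_order(y, max_order=12)
print("selected AR order:", best_order)
print("AR(2) coefficients:", [round(c, 3) for c in fits[2][0]])
```

Running the same search on each of the two series and comparing the selected orders and coefficients would be one way to apply this suggestion to the comparison problem.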
Product Marketing Director - FICO Xpress Optimization | Decision Scientist
2 years ago: Interesting article and approach, thanks for sharing. You might also consider running a cointegration test (e.g., Engle-Granger or Johansen).
BITS RMIT Cotutelle Ph.D. Researcher | Ex - Intel Corporation | DTU'23
2 years ago: Very good to see time series results! You may expand your results by using Singular Spectrum Analysis.