Time Series Episode 7: “Darts” with covariates

Learn how to add external variables in your Darts algorithms

Introduction

Hi there! Happy to see you again in this series of articles, where we discuss Time Series theory and examples.

In the previous articles we discussed ARIMA-family models for forecasting, walked through working examples of how to apply them (based on my experience from multiple projects so far), and also introduced the “Darts” library in Python.

Darts contains lots of forecasting algorithms, making it very easy to do feature engineering, to apply and compare whatever models you want, and to evaluate the results in the end (you can learn more here).

In this story we will apply forecasting algorithms of Darts, showing how to add covariates (a.k.a. “external variables”) and make use of all the knowledge in the dataset.

The goal, however:

Let me highlight that the goal of this story is not to find the best model, but instead to learn how to distinguish past from future covariates and use them in various models, and also to compare how each model performs with and without covariates.

You ready? Let’s start!


Step-by-Step Working Example

We begin with the wind dataset, described in this hands-on tutorial:

LinkedIn: Time Series Episode 3: ARIMA Forecasting with exogenous variables

Medium: Time Series Episode 3: ARIMA Forecasting with exogenous variables

This dataset contains information about wind speed, rain and temperatures. The goal is to predict the wind speed.

Until now, we have worked on examples that contained only the variable to be forecasted. This is known as “Univariate Time Series forecasting”.

This time, however, we have additional variables that may be helpful and deserve further analysis. Such problems are known as “Multivariate Time Series forecasting”.

Looking back at that story as a reminder: we have daily measurements (6,574 rows in total) of Wind, Rain, and maximum, minimum and ground Temperatures (“T.MAX”, “T.MIN” and “T.MIN.G” respectively), as well as some other indicator variables, which we leave out of our analysis. We also limit the dataset for simplicity, do some basic transformations with DATE, and run some basic EDA, which showed no clear upward or downward trend. The Time Series seems stationary, with a recurring pattern of roughly one year. Lastly, we saw that wind speed doesn’t seem to have a strong correlation with any of these variables.

After investigation of parameters and multiple iterations, we had arrived at this ARIMA model predicting the last 10 days:

SARIMAX(1,0,5)(1,1,0,8)

and the best results were given after selecting only T.MAX and T.MIN.G.

Now that we will try more methods than ARIMA, let’s also forecast more steps ahead: 22 days.

Step 1: Read and transform the data

We begin by reading the data as “csv”, transforming it into a special “Timeseries” object required by Darts, plotting the series and checking for seasonality:

# import the basics (we assume these imports throughout this story)
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# read the data (we have daily data)
df = pd.read_csv("wind_dataset.csv")

# convert DATE to datetime and set as index
df['DATE'] = pd.to_datetime(df['DATE'])
df = df.set_index('DATE')
df = df.asfreq(pd.infer_freq(df.index))

# slice the data
df = df[df.index >= '1977-01-01']
df = df[['WIND','RAIN','T.MAX','T.MIN','T.MIN.G']]

# transform the Wind into a TimeSeries object, and create training and validation sets
from darts import TimeSeries
df['date'] = df.index
series = TimeSeries.from_dataframe(df, 'date', 'WIND')
train, val = series.split_after(0.97)  # 97% training series and the rest is the validation set

# plot the data
plt.figure(figsize=(12, 5))
train.plot(label='training')
val.plot(label='validation')
plt.legend()        
Image 1 — Plot the data
# this returns a tuple (season, m), where season is a boolean value
#       indicating whether the series has seasonality or not, and m is the seasonality of the series:

from darts.utils.statistics import check_seasonality, plot_acf, plot_pacf, plot_residuals_analysis
is_seasonal, mseas = check_seasonality(series, max_lag=150)

print("seasonal? " + str(is_seasonal))
if is_seasonal:
    print('There is seasonality of order {}.'.format(mseas))        

Darts provides lots of useful capabilities for data engineering and statistical tests, along with all the forecasting models.

The “check_seasonality” test tells us that the time series has seasonality of 10 days.
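
We can verify this visually with the “plot_acf” utility we already imported; spikes around the detected lag would confirm the period (a quick sketch, reusing the objects defined above):

# visually confirm the detected seasonality (a sketch):
# the ACF plot should show pronounced spikes around multiples of the period
plot_acf(series, m=mseas, max_lag=150)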

Let’s decompose the time series:

# seasonal decomposition

# import the necessary module
from statsmodels.tsa.seasonal import seasonal_decompose

# decompose the time series into its trend, seasonal and residuals components
result_decompose = seasonal_decompose(df['WIND'], model='additive', period=10)
trend     = result_decompose.trend
seasonal  = result_decompose.seasonal
residuals = result_decompose.resid
# plot every component
plt.figure(figsize=(20,10))
plt.subplot(311)
plt.plot(trend)
plt.title('trend')
plt.subplot(312)
plt.plot(seasonal)
plt.title('seasonality')
plt.subplot(313)
plt.plot(residuals)
plt.title('residuals')        
Image 2 — Seasonal decomposition

Now, let’s define the covariates.

Covariates are the external variables that the model can use in order to forecast the target time series. When we use them, we say that we are dealing with a Multivariate forecasting problem.

In this case, the covariates will be the Rain and Temperature time series.

The first step is to make sure there are no nulls, so we impute with average values (for simplicity):

for variable in ['RAIN','T.MAX','T.MIN','T.MIN.G']:
    # replace nulls in each covariate column with its average value
    df[variable].fillna(df[variable].mean(), inplace=True)
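
Mean imputation ignores the temporal structure of the data. As a hedged alternative (not used in the rest of this story), time-based interpolation often works better for time series; Darts also ships its own utility for TimeSeries objects:

# alternative (a sketch, not used below): time-aware interpolation instead of mean imputation
# (method='time' requires the DatetimeIndex we set earlier)
for variable in ['RAIN', 'T.MAX', 'T.MIN', 'T.MIN.G']:
    df[variable] = df[variable].interpolate(method='time')

# Darts also offers a utility that works directly on TimeSeries objects:
# from darts.utils.missing_values import fill_missing_values
# rain_series = fill_missing_values(rain_series)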

The second step is to transform each of them into a Darts TimeSeries object (necessary), to be utilized by the models, and to create yearly/monthly/daily additional features (optional) that could help the models understand the temporal patterns better:

# create covariates series based on additional predictors
rain_series    = TimeSeries.from_dataframe(df, "date", 'RAIN')
t_max_series   = TimeSeries.from_dataframe(df, "date", 'T.MAX')
t_min_series   = TimeSeries.from_dataframe(df, "date", 'T.MIN')
t_min_g_series = TimeSeries.from_dataframe(df, "date", 'T.MIN.G')


# create temporal covariates series
from darts.utils.timeseries_generation import datetime_attribute_timeseries
from darts.dataprocessing.transformers import Scaler

# create month and year covariate series
year_series = datetime_attribute_timeseries(
                pd.date_range(start=series.start_time(), end=series.end_time(), freq=series.freq_str),
                attribute="year",
                one_hot=False)

year_series   = Scaler().fit_transform(year_series)

month_series  = datetime_attribute_timeseries(year_series, attribute="month", one_hot=True, dtype='float32')

day_series    = datetime_attribute_timeseries(year_series, attribute="weekday", one_hot=True, dtype='float32')        

Lastly, we need to distinguish the past from the future covariates, which is very important for Darts.

Past covariates are those variables that are known only in the past. Let’s accept here the T.MAX, T.MIN and T.MIN.G as the past covariates, because they show the minimum and maximum values that happened.

Future covariates are the variables whose values are known also in the future. Let’s accept here that RAIN is known in the future (e.g. we have daily rain forecasts), as well as month and day, of course.

Then we stack together the past covariates, and then the future ones (there are more concise ways, but this is perhaps more intuitive):

# create the past covariates, that are known only in the past (we think as T.MAX, T.MIN and T.MIN.G as known only in the past)
past_covariates = t_max_series
past_covariates = past_covariates.stack(t_min_series)
past_covariates = past_covariates.stack(t_min_g_series)
past_covariates = past_covariates.astype(np.float32)

# create the future covariates, that are known also in the future (we accept that RAIN is known in the future, as well as month and day, of course)
future_covariates = rain_series
future_covariates = future_covariates.stack(month_series)
future_covariates = future_covariates.stack(day_series)
future_covariates = future_covariates.astype(np.float32)        

It is important to note that not all models accept the same types of covariates. Some accept only past covariates, some only future, and some both. That’s why it is very important to know and define beforehand what type each covariate is, based on experience and domain knowledge, in order to utilize it correctly and accurately.

For guidance on which covariate types each model accepts, please refer to the official documentation:

Covariates - darts documentation
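
Alternatively, recent versions of Darts expose properties on every model that report this directly, so you can check in code instead of the docs (a sketch; the property names assume a reasonably recent Darts version):

# check covariate support programmatically (a sketch)
from darts.models import AutoARIMA, NBEATSModel, TiDEModel

candidates = {
    'AutoARIMA': AutoARIMA(),
    'NBEATS':    NBEATSModel(input_chunk_length=10, output_chunk_length=5),
    'TiDE':      TiDEModel(input_chunk_length=10, output_chunk_length=5),
}
for name, m in candidates.items():
    print(name,
          '| past:',   m.supports_past_covariates,
          '| future:', m.supports_future_covariates)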


Step 2: Train multiple models

We set the validation set to 22 days; that is the number of steps that are going to be forecasted.

2.1 Simple Exponential Smoothing (SES)

Let’s start with SES, which weights the most recent past data more heavily:

# these models don't accept covariates, so we will use the series without covariates

model_ses = ExponentialSmoothing(seasonal_periods=10)
model_ses.fit(train)
predictions_ses = model_ses.predict(len(val), num_samples=1000)        
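
Since we asked for num_samples=1000, the SES forecast is probabilistic: 1,000 sampled trajectories. A hedged sketch of plotting it with explicit quantile bounds (TimeSeries.plot accepts low_quantile/high_quantile):

# plot the probabilistic SES forecast with 90% prediction intervals (a sketch)
plt.figure(figsize=(12, 5))
val.plot(label='actual')
predictions_ses.plot(label='forecast_ses', low_quantile=0.05, high_quantile=0.95)
plt.legend()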

2.2 Theta model

We also train a Theta model, which is similar to SES but applies the two theta lines to capture general seasonality and recent trend:

# these models don't accept covariates, so we will use the series without covariates

model_theta = Theta()
model_theta.fit(train)
predictions_theta = model_theta.predict(len(val))        

2.3 Linear Regression

Although more suitable for tabular datasets, we also try Linear Regression by looking at some lags of the Time Series as the predictors.

As we said in the previous article, you should pay attention to the parameter “output_chunk_length”, which is the number of time steps predicted at once (per chunk) by the internal model.

Basically, this means how many days are predicted with each internal model call. So, in the following code snippet, we use 10 lags (i.e. the last 10 days) to predict the next 5 (output_chunk_length = 5). In practice, this means that the first 5 forecasts are then used to predict the next 5 time steps, and so on, until we reach the 22 days of the validation set. We try both without and with covariates:

# without covariates
from darts.models import LinearRegressionModel

model_LR = LinearRegressionModel(
            lags=10,
            output_chunk_length=5)
            
model_LR.fit(train)
predictions_LR = model_LR.predict(len(val))

#---------------------------

# with covariates
model_LR_cov = LinearRegressionModel(
            lags=10,
            output_chunk_length=5,
            lags_past_covariates=10,
            lags_future_covariates=[0,1,2,3,4])
            
model_LR_cov.fit(train, 
                future_covariates=future_covariates, 
                past_covariates=past_covariates)

predictions_LR_cov = model_LR_cov.predict(len(val), 
                                        future_covariates=future_covariates, 
                                        past_covariates=past_covariates)        

Pay attention to the covariates’ code:

We define the lags, and we set “lags_past_covariates” to the same value, so that the model considers the last 10 values of both the target and the past covariates’ time series.

Also, “lags_future_covariates” looks at the future steps 0 through 4 of the future covariates time series, so 5 steps ahead in total, the same count as “output_chunk_length”.

So think of it like this: the past covariates go hand-in-hand with the lags, whereas the future covariates go hand-in-hand with the predicted chunk.

Linear Regression is actually a type of model that accepts both past and future covariates.
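
As a side note, Darts also accepts alternative forms for these lag parameters; if I read the API correctly, the following is equivalent to the model above (a sketch):

# equivalent lag specifications (a sketch):
#   lags=10                        is shorthand for lags=list(range(-10, 0))
#   lags_future_covariates=(0, 5)  is the (past, future) tuple form of [0,1,2,3,4]
model_LR_cov_alt = LinearRegressionModel(
    lags=list(range(-10, 0)),
    output_chunk_length=5,
    lags_past_covariates=10,
    lags_future_covariates=(0, 5))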

2.4 AutoARIMA

Now we will try SARIMA models, based on the analysis and ACF/PACF plots that we constructed in the original story, and we will utilize Darts’ AutoARIMA both with and without covariates (these models accept only future covariates):

# without covariates
from darts.models import AutoARIMA

model_arima = AutoARIMA(
                      m=10,                   # frequency of the series
                      seasonal=True,          # TRUE if seasonal series
                      test='adf',             # stationarity test (not used here, since 'd' is fixed below)
                      d=1,                    # fix the order of differencing to 1
                      start_p=1, start_q=1,   # minimum p and q
                      max_p=6, max_q=2,       # maximum p and q
                      D=1,                    # fix the seasonal differencing to 1
                      start_P=0, start_Q=0,   # minimum P and Q
                      max_P=1, max_Q=1,       # maximum P and Q
                      trace=True,
                      error_action='ignore',  
                      suppress_warnings=True, 
                      stepwise=True
)

model_arima.fit(train)
predictions_arima = model_arima.predict(len(val))

#---------------------------

# with covariates (only future)
model_arima_cov = AutoARIMA(
                      m=10,                   # frequency of the series
                      seasonal=True,          # TRUE if seasonal series
                      test='adf',             # stationarity test (not used here, since 'd' is fixed below)
                      d=1,                    # fix the order of differencing to 1
                      start_p=1, start_q=1,   # minimum p and q
                      max_p=6, max_q=2,       # maximum p and q
                      D=1,                    # fix the seasonal differencing to 1
                      start_P=0, start_Q=0,   # minimum P and Q
                      max_P=1, max_Q=1,       # maximum P and Q
                      trace=True,
                      error_action='ignore',  
                      suppress_warnings=True, 
                      stepwise=True
)

model_arima_cov.fit(train, future_covariates = future_covariates)
predictions_arima_cov = model_arima_cov.predict(len(val), future_covariates = future_covariates)        

Now let’s evaluate what we have so far from the above models:

# evaluate results with MAPE
from darts.metrics import mape
print('MAPE:')
print('Exponential smoothing: ', np.round(mape(predictions_ses, val),2), '%')
print('Autoarima: ', np.round(mape(predictions_arima, val),2), '%')
print('Autoarima (with covariates): ', np.round(mape(predictions_arima_cov, val),2), '%')
print('Theta: ', np.round(mape(predictions_theta, val),2), '%')
print('Linear Regression: ', np.round(mape(predictions_LR, val),2), '%')
print('Linear Regression (with covariates): ', np.round(mape(predictions_LR_cov, val),2), '%')

plt.figure(figsize=(25,10))
val.plot(label='actual', lw=5)
predictions_ses.plot(label='forecast_exponential', lw=2)
predictions_arima.plot(label='forecast_autoarima', lw=2)
predictions_arima_cov.plot(label='forecast_autoarima_cov', lw=2)
predictions_theta.plot(label='forecast_theta', lw=2)
predictions_LR.plot(label='forecast_LR', lw=2)
predictions_LR_cov.plot(label='forecast_LR_cov', lw=2)
plt.legend()        
MAPE:
- Exponential smoothing:  39.6 %
- Autoarima:  44.43 %
- Autoarima (with covariates):  75.99 %
- Theta:  36.63 %
- Linear Regression:  41.79 %
- Linear Regression (with covariates):  39.31 %        
Image 3 — SES vs Theta vs ARIMA vs Linear Regression

The blue line is the actuals, while the orange shaded area shows the forecast boundaries (prediction intervals) of the SES model, since we sampled 1,000 trajectories.

Interestingly, the best results so far come from the Theta model, which uses no covariates at all. Alright, good to know!
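
As the list of models grows, it helps to collect the forecasts in a dictionary and loop over the metrics, instead of repeating print statements; a small sketch reusing the prediction objects from above:

# compare all models so far in one loop (a sketch)
from darts.metrics import mape, rmse

all_predictions = {
    'Exponential smoothing':   predictions_ses,
    'Theta':                   predictions_theta,
    'Linear Regression':       predictions_LR,
    'Linear Regression (cov)': predictions_LR_cov,
    'AutoARIMA':               predictions_arima,
    'AutoARIMA (cov)':         predictions_arima_cov,
}
for name, preds in all_predictions.items():
    print(f"{name:26s} MAPE: {mape(preds, val):6.2f} %   RMSE: {rmse(preds, val):6.2f}")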

2.5 RNN

We discussed and explained RNNs and the meaning of their parameters in the previous article, so you can refer to that for a refresher.

The main parameters here:

  1. training_length: the length (in days) of the training sequences, covering both the input and the output points. Here we set it to 20.
  2. input_chunk_length: the number of past data points (days) to be used for predictions. Here, the RNN will look back 10 days in the past from the moment of prediction. If we increase the value, then we force the RNN to rely more on its long-term memory. However, this parameter should not exceed the previous one (training_length).
  3. n_epochs: the number of epochs over which to train the model, set here at 200 to decrease the error (but don’t overdo it because the model will overfit).
  4. batch_size: the number of time series (input and output sequences) used in each training pass, set here at 64.

First we need to scale the data to the 0–1 range, where neural networks work best, and we will inverse-transform the predictions afterwards:

# for Neural Networks, we need to transform the data, both the training and validation sets
from darts.dataprocessing.transformers import Scaler

# create a Scaler object, fit it on the training data and transform all subsets
main_transformer = Scaler()
train_transformed = main_transformer.fit_transform(train)
val_transformed = main_transformer.transform(val)        

Now let’s train an RNN with LSTM without covariates first:

# RNN without covariates
from darts.models import RNNModel

# define the model parameters
model_rnn = RNNModel(
    model="LSTM",
    training_length=20,
    input_chunk_length=10,
    n_epochs=200,
    batch_size=64,
    optimizer_kwargs={"lr": 1e-3}
)

# train the model
model_rnn.fit(
    train_transformed,
    verbose=True)

# make predictions
predictions_rnn = model_rnn.predict(len(val))

# bring the scaled predictions back to original scale
predictions_rnn = main_transformer.inverse_transform(predictions_rnn)

print('\n')
print('MAPE:')
print('RNN: ', np.round(mape(predictions_rnn, val),2), '%')

plt.figure(figsize=(15,5))
val.plot(label='actual')
predictions_rnn.plot(label='forecast_rnn', lw=3)
plt.legend()         
Image 4 — RNN without covariates

An almost straight line; it doesn’t look great, but let’s continue.

Maybe covariates will help, and note that only the future ones are accepted here.

But first we must also scale them:

#  transform the past covariates series
transformer_past_cov = Scaler()
past_covariates_transformed = transformer_past_cov.fit_transform(past_covariates)

# also transform the future covariates series
transformer_fut_cov = Scaler()
future_covariates_transformed = transformer_fut_cov.fit_transform(future_covariates)


and then train the RNN:

# RNN with covariates

# define the model parameters
model_rnn_cov = RNNModel(
    model="LSTM",
    training_length=20,
    input_chunk_length=10,
    n_epochs=200,
    batch_size=64,
    optimizer_kwargs={"lr": 1e-3})

# train the model (it accepts only future covariates)
model_rnn_cov.fit(
    train_transformed,
    future_covariates = future_covariates_transformed,
    verbose=True)

# make predictions
predictions_rnn_cv = model_rnn_cov.predict(
                            len(val), 
                            future_covariates = future_covariates_transformed)

# bring the scaled predictions back to original scale
predictions_rnn_cv = main_transformer.inverse_transform(predictions_rnn_cv)

print('\n')
print('MAPE:')
print('RNN: ', np.round(mape(predictions_rnn_cv, val),2), '%')

plt.figure(figsize=(15,5))
val.plot(label='actual')
predictions_rnn_cv.plot(label='forecast_rnn_cv', lw=2)
plt.legend()         
Image 5 — RNN with covariates
MAPE of RNNs:
- without covariates:  38.33 %
- with covariates:     92.90 %        

Interesting: the forecast seems to try to follow the pattern, but the MAPE is not good at all, because of the large errors. The near-flat line of the plain RNN gave a better MAPE after all.

I am sure the hyperparameters are not optimal and need more tuning, but that’s not the goal of this story.

The goal is to show how to use the covariates, as we said, and to compare each model’s performance with and without them. So let’s move forward; maybe we will deal with tuning in a future story, who knows!

2.6 NBEATS

Another neural network approach, designed for time series forecasting. As mentioned in the previous story, N-BEATS (standing for “Neural Basis Expansion Analysis for Time Series”) uses a stack of fully connected layers (MLPs) organized into blocks. Each block learns different aspects of the time series data, such as trend, seasonality, or other patterns.

Let’s train a basic NBEATS model:

from darts.models import NBEATSModel

model_nbeats = NBEATSModel(
                    input_chunk_length=len(val)*4, 
                    output_chunk_length=len(val),
                    n_epochs=200)

# train the model without covariates
model_nbeats.fit(train)
predictions_nbeats = model_nbeats.predict(len(val))

print('NBEATS: ', np.round(mape(predictions_nbeats, val),2), '%')
plt.figure(figsize=(15,5))
val.plot(label='actual', lw=5)
predictions_nbeats.plot(label='forecast_NBEATS', lw=2)
plt.legend()        
Image 6 — NBEATS without covariates

Ok, looks good. What about using (past only) covariates?

from darts.models import NBEATSModel

model_nbeats_cov = NBEATSModel(
                    input_chunk_length=len(val)*4, 
                    output_chunk_length=len(val),
                    n_epochs=200)

# train the model with past covariates only 
model_nbeats_cov.fit(train, past_covariates = past_covariates)
predictions_nbeats_cov = model_nbeats_cov.predict(len(val), past_covariates = past_covariates)

print('NBEATS with cov: ', np.round(mape(predictions_nbeats_cov, val),2), '%')
plt.figure(figsize=(15,5))
val.plot(label='actual', lw=5)
predictions_nbeats_cov.plot(label='forecast_NBEATS_cov', lw=2)
plt.legend()        
Image 7 — NBEATS with covariates
MAPE of NBEATS:
- without covariates:  33.09 %  
- with covariates:     28.34 %        

Even better!

In the case of NBEATS, the past covariates really helped, making it the best model so far.

2.7 NHiTS

Another neural network approach, which was also discussed in previous article.

N-HiTS (“Neural Hierarchical Interpolation for Time Series”) is an improvement over N-BEATS, that introduces a hierarchical decomposition strategy, where forecasts are generated at different scales or resolutions. This helps in capturing both short-term and long-term patterns in the data more effectively.

It uses a combination of interpolation and extrapolation techniques to better estimate future values of a time series. This approach allows N-HiTS to handle missing data and make robust forecasts over different time horizons.

Like N-BEATS, N-HiTS decomposes the time series into different components (such as trend, seasonality, and residuals), but does so in a more structured, hierarchical manner, which helps capture the multi-scale nature of time series data. As such, it has fewer parameters to train and is consequently faster than N-BEATS.

Start with a basic NHiTS model:

from darts.models import NHiTSModel

model_nhits = NHiTSModel(
                    input_chunk_length=len(val)*4, 
                    output_chunk_length=len(val),
                    n_epochs=200)

# train the model without covariates
model_nhits.fit(train)
predictions_nhits = model_nhits.predict(len(val))

print('NHiTS: ', np.round(mape(predictions_nhits, val),2), '%')
plt.figure(figsize=(15,5))
val.plot(label='actual', lw=5)
predictions_nhits.plot(label='forecast_NHiTS', lw=2)
plt.legend()        
Image 8 — NHiTS without covariates

Seems good, and it was actually better than NBEATS (without covariates).

Now, with past-only covariates:

from darts.models import NHiTSModel

model_nhits_cov = NHiTSModel(
                    input_chunk_length=len(val)*4, 
                    output_chunk_length=len(val),
                    n_epochs=200)

# train the model with past covariates only
model_nhits_cov.fit(train, past_covariates = past_covariates)
predictions_nhits_cov = model_nhits_cov.predict(len(val), past_covariates=past_covariates)

print('NHiTS with cov: ', np.round(mape(predictions_nhits_cov, val),2), '%')
plt.figure(figsize=(15,5))
val.plot(label='actual', lw=5)
predictions_nhits_cov.plot(label='forecast_NHiTS_cov', lw=2)
plt.legend()        
Image 9 — NHiTS with covariates
MAPE of NHiTS:
- without covariates:  31.88 %
- with covariates:     32.85 %        

Results were not improved this time by the use of covariates.

2.8 TiDE

This is the first time in this series that we are experimenting with such a model.

TiDE is similar to Transformers, but attempts to provide better performance at lower computational cost by introducing multilayer perceptron (MLP)-based encoder-decoders without attention.

This model supports past covariates (known for “input_chunk_length” points before prediction time), future covariates (known for “output_chunk_length” points after prediction time), static covariates, as well as probabilistic forecasting. Here we have only past and future covariates to deal with.

The encoder and decoder are implemented as a series of residual blocks. The number of residual blocks in the encoder and decoder can be controlled via num_encoder_layers and num_decoder_layers respectively. The width of the layers in the residual blocks can be controlled via hidden_size, and similarly the width of the layers in the temporal decoder via temporal_decoder_hidden. For simplicity, however, we leave these hyperparameters at their default values.
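
For reference, here is a hedged sketch of what setting them explicitly could look like (the values are illustrative, not tuned):

# explicit TiDE hyperparameters (a sketch; values are illustrative, not tuned)
from darts.models import TiDEModel

model_tide_custom = TiDEModel(
    input_chunk_length=len(val)*4,
    output_chunk_length=len(val),
    num_encoder_layers=2,        # residual blocks in the encoder
    num_decoder_layers=2,        # residual blocks in the decoder
    hidden_size=128,             # width of the encoder/decoder residual blocks
    temporal_decoder_hidden=32,  # width of the temporal decoder layers
    n_epochs=200)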

Train a basic model:

from darts.models import TiDEModel

model_tide = TiDEModel(
                    input_chunk_length=len(val)*4, 
                    output_chunk_length=len(val),
                    n_epochs=200
                    )

# train the model with transformed data
model_tide.fit(train_transformed)

# create the predictions
predictions_tide = model_tide.predict(len(val))

# bring the scaled predictions back to original scale
predictions_tide = main_transformer.inverse_transform(predictions_tide)

print('TiDE: ', np.round(mape(predictions_tide, val),2), '%')
plt.figure(figsize=(15,5))
val.plot(label='actual', lw=5)
predictions_tide.plot(label='forecast_TiDE', lw=2)
plt.legend()        
Image 10 — TiDE without covariates

And now with both past and future covariates:

from darts.models import TiDEModel

model_tide_cov = TiDEModel(
                    input_chunk_length=len(val)*4, 
                    output_chunk_length=len(val),
                    n_epochs=200)

# train the model with transformed data (new name, so we don't overwrite the previous model)
model_tide_cov.fit(train_transformed, 
               past_covariates = past_covariates_transformed, 
               future_covariates=future_covariates_transformed)

# create the predictions
predictions_tide_cov = model_tide_cov.predict(len(val), 
                                      past_covariates=past_covariates_transformed, 
                                      future_covariates=future_covariates_transformed)

# bring the scaled predictions back to original scale
predictions_tide_cov = main_transformer.inverse_transform(predictions_tide_cov)

print('TiDE with cov: ', np.round(mape(predictions_tide_cov, val),2), '%')
plt.figure(figsize=(15,5))
val.plot(label='actual', lw=5)
predictions_tide_cov.plot(label='forecast_TiDE_cov', lw=2)
plt.legend()        
Image 11 — TiDE with covariates
MAPE of TiDE:
- without covariates:  55.21 %
- with covariates:    188.63 %        

Weeeeell, the covariates certainly didn’t help here! But it’s ok, we are just experimenting.


Step 3: Evaluate results

The best results so far were given by NBEATS with past covariates: the forecasts follow the actuals’ trendline and achieved the smallest MAPE.
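
Before fully trusting a single 22-day window, a more robust check would be to backtest over several consecutive windows; here is a hedged sketch using Darts’ historical_forecasts, assuming the trained model_nbeats_cov from above:

# backtest the winning model over multiple non-overlapping windows (a sketch)
backtest = model_nbeats_cov.historical_forecasts(
    series,
    past_covariates=past_covariates,
    start=0.9,             # begin forecasting at 90% of the series
    forecast_horizon=22,   # same horizon as our validation set
    stride=22,             # jump one full horizon between windows
    retrain=False,         # reuse the already-trained weights
    verbose=True)
print('Backtest MAPE:', np.round(mape(backtest, series), 2), '%')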


Conclusion

In this article we experimented with multiple models from the Darts library in Python, trying to forecast a simple Time Series dataset.

The real goal was not to find the best model. Instead, the goal was to learn how to utilize past and future external variables, known as “covariates”, to our advantage, and to compare each model’s performance with and without them.

Some things to keep in mind before you go:

  1. Darts provides an easy-to-use framework to transform your Time Series and apply multiple models, ranging from statistical ones up to neural networks, as well as easy utilization of covariates.
  2. It is very important to distinguish which covariates are known only in the past and which have values also known in the future (usually forecasts or temporal information). What we designate as a past or a future covariate can make a great difference in the forecasts, and that decision requires experience and domain knowledge.
  3. The covariates don’t always help, sometimes they lead to worse performance.
  4. Some models accept only past covariates, some only future and some both. Know your model and its inner workings.

Results, for example, would have been different had we selected different combinations of past and future covariates, or different model hyperparameters. So, once again, you need to keep experimenting to improve your model as much as you can.

I am also using this library in my day-to-day work for Time Series projects and exploring its potential for data handling and modeling, so I am learning with you along the way!


You can also read my original article published on Medium :

Time Series Episode 7: “Darts” with covariates

Feel free to connect with me on LinkedIn and Kaggle.

Thanks for reading!
