Volatility Modeling - journey from ARCH to NN and MCMC

Part 1 / 2

[Note: If you'd like full text and full code, go to: Quant Journey with Code | Jakub | Substack]

What is volatility?

Volatility is important in the world of finance. It's like a weathervane that shows investors how stormy or calm the market is. In simple terms, it's about how much the price of something like a stock or a currency goes up and down. If prices jump around a lot, we say there's high volatility. This is important because it helps people decide whether to buy or sell, and it's a key part of many financial decisions.

As our world becomes more connected, things in the financial markets can change quickly and unpredictably. That's why understanding volatility is so important. It's like a measure of how risky the market is. Volatility is at the heart of how we price things and how we handle risks.

In the past, experts have used certain methods, like the ARCH and GARCH models, to try and predict how volatile the market will be. These methods are like weather forecasts for the market, helping people guess how choppy the financial markets might be. But they're not perfect. Sometimes they don't quite catch the market's twists and turns, especially when things get wild.

Recently, because of innovative technology and more advanced computer programs, there's been a fresh push to get better at predicting volatility. It's about combining old-school finance knowledge with recent technology to make better predictions. This isn't about guessing numbers; it's about really understanding the risks and being ready for whatever the market might do next.

Modeling volatility

In modeling the volatility of stock time series data, especially with models like ARCH and GARCH, the main goal is often to accurately capture and predict volatility dynamics, not necessarily to find the model with the best out-of-sample predictions. The model selection criterion you choose should reflect this goal, balancing goodness of fit with model complexity to avoid overfitting.

Here are some considerations for choosing the right criterion in this context:

  1. Akaike Information Criterion (AIC): Use if you prioritize goodness of fit and want the model to capture as much of the data's complexity as possible. AIC penalizes model complexity less heavily than BIC, so it is more prone to overfitting, but it can also be more sensitive to subtle structure in the volatility.
  2. Bayesian Information Criterion (BIC): Use if you are dealing with larger datasets and are concerned about overfitting. BIC is preferred when the primary goal is to select the correct model. Since it penalizes model complexity more heavily than AIC, it tends to be more robust when you have a lot of data and many candidate models.
  3. Hannan-Quinn Information Criterion (HQIC): Use if you want a penalty on model complexity between that of AIC and BIC. It is not as commonly used as AIC or BIC but can be a good middle ground.
  4. Cross-Validation: Use if your primary goal is predictive accuracy on new, unseen data. While more computationally intensive, time-series cross-validation methods (like rolling forecast origin or expanding window) provide a robust estimate of a model's out-of-sample predictive performance.

For financial time series, particularly when the data size is substantial and the risk of overfitting is a concern, BIC is often the preferred choice. It balances the model's complexity against its explanatory power, providing a safeguard against overfitting by introducing a stricter penalty for the number of parameters. This is crucial for models like ARCH and GARCH, where adding lags can quickly increase model complexity.
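To make this concrete, here is a minimal sketch of extracting these criteria from fitted models. It assumes the arch package (used throughout this post) and synthetic heavy-tailed returns; the fitted result exposes .aic and .bic directly, while HQIC is computed by hand from the log-likelihood.

import numpy as np
from arch import arch_model

# Synthetic heavy-tailed "returns", purely for illustration
rng = np.random.default_rng(42)
returns = rng.standard_t(df=5, size=2000)

for p in (1, 2, 3):
    res = arch_model(returns, mean='zero', vol='ARCH', p=p).fit(disp='off')
    # HQIC = -2*logL + 2*k*ln(ln(n)), computed manually from the fit
    k, n_obs = res.num_params, res.nobs
    hqic = -2 * res.loglikelihood + 2 * k * np.log(np.log(n_obs))
    print(f"ARCH({p})  AIC={res.aic:.1f}  BIC={res.bic:.1f}  HQIC={hqic:.1f}")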

The code from this post is at: https://github.com/jpolec/quantjourney

First, let’s get data and volatility

I have used yfinance and Apple (AAPL) daily returns for this post.

import datetime

import matplotlib.pyplot as plt
import numpy as np
import yfinance as yf
from arch import arch_model
from sklearn.metrics import mean_squared_error as mse

stocks = 'AAPL'
start = datetime.datetime(2012, 1, 1)
end = datetime.datetime(2023, 1, 1)
data = yf.download(stocks, start, end, interval='1d')

# Daily returns in percent; drop the initial NaN from pct_change()
ret = 100 * data["Adj Close"].pct_change().dropna()
# Realized-volatility proxy: 5-day rolling standard deviation of returns
realized_vol = ret.rolling(5).std()

plt.figure(figsize=(12, 8))
plt.plot(realized_vol)
plt.show()
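Note that ret is in percent, so realized_vol is a daily volatility in percent. If an annualized figure is needed, the usual convention (assuming roughly 252 trading days per year) is:

# realized_vol is a daily percent volatility; annualize with sqrt(252)
annualized_vol = realized_vol * np.sqrt(252)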

The output is a plot of the 5-day rolling realized volatility.

Let’s code the ARCH model

The autoregressive conditional heteroskedasticity (ARCH) model, introduced by Robert Engle in 1982, is a tool in financial econometrics for modeling and predicting the volatility of time series data, such as stock returns. Heteroskedasticity refers to non-constant variance in the error terms of a regression model over time; in simpler terms, the spread or "volatility" of a time series is not uniform across time. This is a common feature of financial time series, where periods of high volatility (large price fluctuations) and periods of low volatility (small fluctuations) occur in clusters, a phenomenon called volatility clustering.
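For reference, an ARCH(p) model lets the conditional variance depend on the last p squared shocks:

\sigma_t^2 = \omega + \sum_{i=1}^{p} \alpha_i \epsilon_{t-i}^2, \qquad \omega > 0, \ \alpha_i \ge 0

where \epsilon_t is the return shock and \sigma_t^2 its conditional variance. The non-negativity constraints keep the variance positive.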

The ARCH model's ability to model time-varying volatility makes it a valuable tool for risk management and option pricing in financial markets.

Let’s get our code first:

# Out-of-sample window: all trading days from 2020-01-01 onward
# (slicing the index directly avoids mixing calendar and trading days)
split_date = ret.index[ret.index >= '2020-01-01']
n = len(split_date)

arch = arch_model(ret, mean='zero', vol='ARCH', p=1).fit(disp='off')
print(arch.summary())

# Extract the conditional volatility (standard deviation)
conditional_volatility = arch.conditional_volatility

# Plot the actual returns and the conditional volatility
plt.figure(figsize=(12, 8))
plt.plot(ret.index, ret, label='Actual Returns')
plt.plot(conditional_volatility.index, conditional_volatility, label='Conditional Volatility', linestyle='--')

plt.title('Actual Returns vs. Conditional Volatility from ARCH(1)')
plt.xlabel('Date')
plt.ylabel('Returns / Volatility')
plt.legend()
plt.show()        

The summary:

Here, you can see:

  1. Vol Model: "ARCH" indicates that the volatility of the series is being modeled as a function of past error terms.
  2. Distribution: A normal distribution has been assumed for the error terms.
  3. Method: The parameters have been estimated using "Maximum Likelihood," a common method of estimating the parameters of a statistical model.
  4. AIC and BIC: These are information criteria used to compare models - lower values indicate a better model fit, given a balance between goodness of fit and simplicity.
  5. No. Observations: The model has been fitted using 2784 observations.
  6. Df Residuals and Df Model: The degrees of freedom for the residuals and the model. Here, there are 2784 residuals (equal to the number of observations) and 0 degrees of freedom for the model, which is expected: with mean='zero', no mean parameters are estimated.
  7. Volatility Model: It lists two parameters, omega and alpha[1], with their coefficients (1.2458 for omega and 0.2046 for alpha[1]), standard errors, t-statistics, p-values, and 95% confidence intervals. The p-values are very small, indicating that both coefficients are significantly different from zero at conventional levels of significance.

  • omega: It has a coefficient of 1.2458, which is significant, and the confidence interval does not include zero, further supporting its significance.
  • alpha[1]: The coefficient for alpha[1] is 0.2046, which is also significantly different from zero, indicating that past volatility (squared residuals) has a positive and significant effect on current volatility.

ARCH models are designed to model volatility, not returns. The significance of the ARCH terms suggests that a volatility effect is present in the data.
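As a sanity check (a sketch, not part of the original code): if the fitted ARCH model is adequate, the standardized residuals ret_t / sigma_t should show little remaining autocorrelation in their squares. One way to test this is a Ljung-Box test from statsmodels:

from statsmodels.stats.diagnostic import acorr_ljungbox

# Standardized residuals from the fitted model above
std_resid = ret / arch.conditional_volatility
# Ljung-Box test on squared standardized residuals at lag 10;
# a large p-value means no strong evidence of leftover ARCH effects
print(acorr_ljungbox(std_resid**2, lags=[10], return_df=True))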

The plot outcome:


Since the parameter p sets the number of lags used, let’s extend the code to search over it:

bic_arch = []
for p in range(1, 5):
    arch = arch_model(ret, mean='zero', vol='ARCH', p=p).fit(disp='off')
    bic_arch.append(arch.bic)
    if arch.bic == np.min(bic_arch):
        best_param = p
arch = arch_model(ret, mean='zero', vol='ARCH', p=best_param).fit(disp='off')
print(arch.summary())

forecast = arch.forecast(start=split_date[0])
forecast_arch = forecast
# Compare realized and forecast volatility in the same units (percent)
rmse_arch = np.sqrt(mse(realized_vol[-n:],
                        np.sqrt(forecast_arch.variance.iloc[-len(split_date):])))
print("The RMSE value of ARCH model is {:.4f}".format(rmse_arch))

plt.figure(figsize=(12, 6))
plt.plot(realized_vol,label='Actual Volatility')
# Extracting the forecasted volatility from the model's forecast
predicted_volatility = np.sqrt(forecast_arch.variance.iloc[-len(split_date):])
# Plotting the predicted volatility
plt.plot(predicted_volatility, label='ARCH Predicted Volatility', color='red', linestyle='-')
plt.title('Actual vs. ARCH Predicted Volatility')
plt.xlabel('Date')
plt.ylabel('Volatility')
plt.legend()
plt.show()        

Here, we loop over the lag order p. The idea is to find the minimal BIC, since BIC is a model selection criterion that balances goodness of fit against complexity (number of parameters). So, the summary is as follows:

  1. bic_arch: A list that collects the BIC value of each model fitted in the loop. For each p, we fit an ARCH(p) model and append its BIC to this list.
  2. np.min(bic_arch): Finds the smallest BIC value in the bic_arch list. Among all the models considered, the one with the smallest BIC is preferred because it provides the best balance between fitting the data well and not being overly complex. (A more compact way to write this search follows below.)
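The same search can be written more compactly; a sketch, equivalent to the loop above:

# Pick the ARCH order with the smallest BIC in one expression
best_p = min(range(1, 5),
             key=lambda p: arch_model(ret, mean='zero', vol='ARCH', p=p)
                           .fit(disp='off').bic)

Each candidate is still fitted exactly once; this just avoids tracking the running minimum by hand.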

GARCH

The Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model is an extension of the ARCH model, providing a more comprehensive framework for modeling financial time series volatility. The extension is particularly useful because it allows the model to capture longer-term dependencies in volatility than the ARCH model, which includes only lagged squared error terms.

GARCH Model Formulation:

A GARCH(p, q) model can be defined as follows:

\sigma_t^2 = \omega + \sum_{i=1}^{q} \alpha_i \epsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2, \qquad \omega > 0, \ \alpha_i \ge 0, \ \beta_j \ge 0

The parameters p and q represent the orders of the GARCH and ARCH components, respectively. The GARCH terms (β_j) capture the persistence of volatility shocks, while the ARCH terms (α_i) capture the immediate impact of past shocks. (Note that the arch package labels these the other way around: its p argument counts the ARCH terms and its q argument the GARCH terms.)

Advantages of GARCH Models:

  • Efficiency: They often require fewer parameters than an ARCH model to capture the same amount of volatility dynamics due to their ARMA-like structure.
  • Flexibility: GARCH models can capture long-run volatility patterns as well as short-run fluctuations.
  • Persistence: They allow for the modeling of volatility persistence, where the effects of past variances on future variances decay slowly (see the simulation sketch after this list).
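To see what persistence means in practice, here is a minimal simulation sketch of a GARCH(1,1) process (parameter values are illustrative, not estimated): with alpha + beta close to 1, a large shock keeps volatility elevated for many periods, producing the familiar clustering.

import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
omega, alpha, beta = 0.05, 0.08, 0.90   # alpha + beta = 0.98 -> high persistence
n_sim = 1500
sigma2 = np.empty(n_sim)
eps = np.empty(n_sim)
sigma2[0] = omega / (1 - alpha - beta)   # start at the unconditional variance
eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
for t in range(1, n_sim):
    sigma2[t] = omega + alpha * eps[t - 1]**2 + beta * sigma2[t - 1]
    eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

plt.plot(eps)
plt.title('Simulated GARCH(1,1) returns: volatility clusters in time')
plt.show()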

garch = arch_model(ret, mean='zero', vol='GARCH', p=1, o=0, q=1).fit(disp='off')
print(garch.summary())

bic_garch = []
for p in range(1, 5):
    for q in range(1, 5):
        garch = arch_model(ret, mean='zero', vol='GARCH', p=p, o=0, q=q).fit(disp='off')
        bic_garch.append(garch.bic)
        if garch.bic == np.min(bic_garch):
            best_param = p, q
garch = arch_model(ret, mean='zero', vol='GARCH',
                   p=best_param[0], o=0, q=best_param[1]).fit(disp='off')
print(garch.summary())

forecast = garch.forecast(start=split_date[0])
forecast_garch = forecast
# Compare realized and forecast volatility in the same units (percent)
rmse_garch = np.sqrt(mse(realized_vol[-n:],
                         np.sqrt(forecast_garch.variance.iloc[-len(split_date):])))
print('The RMSE value of GARCH model is {:.6f}'.format(rmse_garch))

plt.figure(figsize=(12, 6))
plt.plot(realized_vol,label='Actual Volatility')
# Extracting the forecasted volatility from the model's forecast
predicted_volatility = np.sqrt(forecast_garch.variance.iloc[-len(split_date):])
# Plotting the predicted volatility
plt.plot(predicted_volatility, label='GARCH Predicted Volatility', color='red', linestyle='-')
plt.title('Actual vs. GARCH Predicted Volatility')
plt.xlabel('Date')
plt.ylabel('Volatility')
plt.legend()
plt.show()        

The difference is that we now iterate over two loops, for p and q, and find the (p, q) pair with the minimal garch.bic, just like in the previous example.

So, the summary is as follows:

and plot:

From the plot, we can see that the GARCH model captured return volatility reasonably well. This is due to its ability to reproduce volatility clustering and the leptokurtic (heavy-tailed) nature of financial returns, which are typically not normally distributed.

GJR-GARCH

While the standard GARCH model offers a robust framework for modeling financial volatility, it doesn't account for the asymmetric impact of market shocks: typically, negative shocks increase volatility more than positive shocks of the same magnitude. The GJR-GARCH model addresses this limitation by incorporating the leverage effect, offering a more precise and nuanced understanding of market dynamics, especially during turbulent periods. This added precision makes it a compelling choice for enhanced volatility forecasting and risk assessment.

Introduced by Glosten, Jagannathan, and Runkle, the GJR-GARCH model is an extension of the standard GARCH model, specifically designed to account for the leverage effect. It does this by including an additional term that allows the model's volatility equation to react differently to positive and negative shocks. This feature makes the GJR-GARCH model particularly valuable in scenarios where:

  1. Leverage Effect is Evident: In many financial markets, negative news or market downturns lead to greater uncertainty and risk (volatility) than positive news or gains of equivalent magnitude. The GJR-GARCH model's ability to differentiate between these effects can provide a more accurate depiction of market dynamics.
  2. Risk Management Needs: For financial institutions and investors, accurately estimating and managing risk is crucial. The GJR-GARCH model's nuanced approach to modeling volatility can improve risk management strategies and financial decision-making, especially in managing portfolios and pricing derivatives.
  3. Improved Model Fit and Predictive Power: Empirical evidence suggests that incorporating the leverage effect improves the statistical fit of volatility models for many financial time series. This can lead to better predictions of future volatility and more informed investment decisions.

The GJR-GARCH model is particularly useful in risk management and financial derivatives pricing because it more accurately represents the risk dynamics observed in real-world financial markets.
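For reference, a common GJR-GARCH(1,1) specification adds an indicator term to the variance equation:

\sigma_t^2 = \omega + (\alpha + \gamma I_{t-1}) \epsilon_{t-1}^2 + \beta \sigma_{t-1}^2, \qquad I_{t-1} = 1 \ \text{if} \ \epsilon_{t-1} < 0, \ \text{else} \ 0

A positive \gamma means negative shocks raise next-period variance more than positive shocks of the same size, which is exactly the leverage effect described above.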

bic_gjr_garch = []
# Search over both orders, with one asymmetric term (o=1) for the leverage effect
for p in range(1, 5):
    for q in range(1, 5):
        gjrgarch = arch_model(ret, mean='zero', vol='GARCH',
                              p=p, o=1, q=q).fit(disp='off')
        bic_gjr_garch.append(gjrgarch.bic)
        if gjrgarch.bic == np.min(bic_gjr_garch):
            best_param = p, q
gjrgarch = arch_model(ret, mean='zero', vol='GARCH',
                      p=best_param[0], o=1, q=best_param[1]).fit(disp='off')
print(gjrgarch.summary())
forecast = gjrgarch.forecast(start=split_date[0])
forecast_gjrgarch = forecast

# Compare realized and forecast volatility in the same units (percent)
rmse_gjr_garch = np.sqrt(mse(realized_vol[-n:],
                             np.sqrt(forecast_gjrgarch.variance.iloc[-len(split_date):])))
print("The RMSE value of GJR-GARCH model is {:.6f}".format(rmse_gjr_garch))

plt.figure(figsize=(12, 6))
plt.plot(realized_vol,label='Actual Volatility')
# Extracting the forecasted volatility from the model's forecast
predicted_volatility = np.sqrt(forecast_gjrgarch.variance.iloc[-len(split_date):])
# Plotting the predicted volatility
plt.plot(predicted_volatility, label='GARCH GJR Predicted Volatility', color='red', linestyle='-')
plt.title('Actual vs. GARCH GJR Predicted Volatility')
plt.xlabel('Date')
plt.ylabel('Volatility')
plt.legend()
plt.show()        

So, the summary is as follows:

and plot:

EGARCH (Exponential GARCH)

The EGARCH model, proposed by Nelson in 1991, allows for asymmetry in the volatility response to shocks by modeling the log of the conditional variance rather than the variance itself. This ensures that the conditional variance remains positive.
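For reference, one common EGARCH(1,1) parameterization is

\ln \sigma_t^2 = \omega + \beta \ln \sigma_{t-1}^2 + \alpha \left( |z_{t-1}| - E|z_{t-1}| \right) + \gamma z_{t-1}, \qquad z_t = \epsilon_t / \sigma_t

Because the model is written in logs, \sigma_t^2 is positive for any parameter values, and the \gamma z_{t-1} term lets negative and positive shocks move volatility by different amounts.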

bic_egarch = []
for p in range(1, 5):
    for q in range(1, 5):
        egarch = arch_model(ret, mean='zero', vol='EGARCH',
                            p=p, q=q).fit(disp='off')
        bic_egarch.append(egarch.bic)
        if egarch.bic == np.min(bic_egarch):
            best_param = p, q
egarch = arch_model(ret, mean='zero', vol='EGARCH',
                    p=best_param[0], q=best_param[1]).fit(disp='off')
print(egarch.summary())

forecast = egarch.forecast(start=split_date[0])
forecast_egarch = forecast
# Compare realized and forecast volatility in the same units (percent)
rmse_egarch = np.sqrt(mse(realized_vol[-n:],
                          np.sqrt(forecast_egarch.variance.iloc[-len(split_date):])))
print("The RMSE value of EGARCH model is {:.6f}".format(rmse_egarch))

plt.figure(figsize=(12, 6))
plt.plot(realized_vol,label='Actual Volatility')
# Extracting the forecasted volatility from the model's forecast
predicted_volatility = np.sqrt(forecast_egarch.variance.iloc[-len(split_date):])
# Plotting the predicted volatility
plt.plot(predicted_volatility, label='EGARCH Predicted Volatility', color='red', linestyle='-')
plt.title('Actual vs. EGARCH Predicted Volatility')
plt.xlabel('Date')
plt.ylabel('Volatility')
plt.legend()
plt.show()        

with the plot:

The main difference in the EGARCH equation is that the logarithm of the conditional variance appears on the left-hand side, which guarantees a positive variance without constraints on the parameters. The asymmetric term (γ) captures the leverage effect: the negative correlation between past asset returns and volatility.

Summary

Model Selection and Criteria: When modeling financial time series data, it's essential to balance model fit and complexity to prevent overfitting. Criteria like AIC, BIC, and cross-validation are used to select the most appropriate model. For substantial datasets at risk of overfitting, BIC is often preferred due to its stricter penalty on model complexity.

Practical Application: The analysis uses Apple stock returns to demonstrate volatility modeling. The ARCH model is applied first, revealing the volatility dynamics and model parameters' significance. The GARCH model is introduced next, offering a more efficient and flexible approach to capturing longer-term volatility patterns. The GJR-GARCH model extends this by accounting for the leverage effect, reflecting the asymmetric impact of market shocks on volatility. Lastly, the EGARCH model is presented, allowing for asymmetry and ensuring positive conditional variance through the logarithm transformation.

Model Performance and Comparison: The RMSE (Root Mean Square Error) values for each model are calculated, providing a measure of the models' predictive accuracy. Comparing these values helps in assessing which model performs best in forecasting volatility.
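To put the comparison in one place, a small sketch (assuming the rmse_* variables from the sections above are still in scope):

import pandas as pd

# Collect the out-of-sample RMSEs computed above; lower is better
rmse_table = pd.Series({'ARCH': rmse_arch,
                        'GARCH': rmse_garch,
                        'GJR-GARCH': rmse_gjr_garch,
                        'EGARCH': rmse_egarch},
                       name='RMSE').sort_values()
print(rmse_table)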
