Actuarial Techniques for COVID-19 in Malta: Errors & Uncertainty
A version of this article has been published in the Times of Malta and can be viewed here.
This is the fourth article in a series taking an actuarial perspective on the development of the pandemic in Malta, a small archipelago in the Mediterranean. The first article introduced the potential use of Markov Chains, the second focused on exponential functions and the third applied development factors.
The focus of the last article (published 23rd March) was a projected total number of COVID-19 cases in Malta, should these follow the pattern of growth seen in China, Italy, Spain or Singapore, or a constant 25.6% rate of growth. As at the end of March, we had 169 cases in Malta, which is far lower than any of the five predictions (the lowest was Singapore at 187 expected cases). As at yesterday (7th April), Malta had 293 cases compared to a prediction of 258 if we follow the Singapore development.
Here I explain why these predictions, and others, will be incorrect at this stage and why "when will the peak be?" is a bit of a silly question to ask.
Random Fluctuations
Since no outcome is certain, there are bound to be some fluctuations in the daily number of cases. For example, a simple prediction method is to assume that the number of confirmed cases tomorrow equals the average over the past five days. On that basis, tomorrow (Thursday 9th) we should expect 19.4 cases. However, it would be very reasonable to expect anywhere between 5 and 20 [Comment added on Thursday 9th: we actually had 38 new cases, further demonstrating the volatility in results].
In reality that number can fluctuate a fair bit due to natural randomness. The variability is very high: over the past five days we have seen anywhere from 6 to 52 cases per day. In actuarial parlance this is called 'process error'.
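To get a feel for process error, here is a minimal Python sketch. It assumes, purely for illustration, that daily new cases follow a Poisson distribution whose mean is the five-day average of 19.4; the Poisson assumption and the function name `poisson_interval` are not part of the original forecasts. It shows that even when the mean is known exactly, a single day's outcome has a wide range.

```python
import math

def poisson_interval(mean, coverage=0.95):
    """Return (low, high) counts bounding a central `coverage` interval
    of a Poisson distribution with the given mean."""
    tail = (1.0 - coverage) / 2.0
    cumulative = 0.0
    low = high = None
    k = 0
    while high is None:
        # Probability of exactly k cases, computed in log space for stability
        cumulative += math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))
        if low is None and cumulative >= tail:
            low = k
        if cumulative >= 1.0 - tail:
            high = k
        k += 1
    return low, high

# 19.4 is the five-day average quoted in the text; Poisson is an assumption
low, high = poisson_interval(19.4)
print(f"Expected 19.4 new cases; plausible range {low} to {high}")
```

Days with 38 or 52 new cases fall above the upper bound this sketch produces, which suggests that the real daily counts are more volatile than a plain Poisson assumption allows.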
External Factors
Any random fluctuations are bound to be affected by external factors. For example, lockdowns, the use of masks (as in the Czech Republic) and the closure of schools should all affect the growth rate of the pandemic, with varying degrees of success. Even the weather could have an effect, either directly (as there might be evidence of less COVID-19 spread in high temperature and high humidity) or indirectly, as many may decide to go for a picnic if the weather is beautiful, thus increasing contagion.
Model Error
Pandemics tend to spread exponentially at first and then their rate of growth slows down, at which point the peak would have been reached. In most cases, an equation (usually an S-shaped curve such as a logistic or Gompertz curve) or a model (such as the SIR system of differential equations) is developed to explain and predict the number of future cases.
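For readers who have not met these curves, the two S-shaped equations can be sketched in a few lines of Python. These are the standard textbook forms; the parameter names and the numbers plugged in are hypothetical and chosen only to show the shape, not fitted to Malta or any other country.

```python
import math

def logistic(t, total, rate, midpoint):
    """Cumulative cases on day t under a logistic curve.
    `total` is the eventual number of cases; growth is fastest at `midpoint`."""
    return total / (1.0 + math.exp(-rate * (t - midpoint)))

def gompertz(t, total, displacement, rate):
    """Cumulative cases on day t under a Gompertz curve.
    `displacement` shifts the curve along the time axis."""
    return total * math.exp(-displacement * math.exp(-rate * t))

# Hypothetical parameters, for illustration only
for day in range(0, 61, 10):
    print(day,
          round(logistic(day, 500, 0.2, 30)),
          round(gompertz(day, 500, 20, 0.1)))
```

Both curves start slowly, accelerate, and then flatten towards `total`. The "peak" in daily new cases is simply the steepest point of the curve, so a different curve or a different set of parameters moves the peak.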
The problem is that the wrong model or equation might be used. Going back to the example of predicting tomorrow's number of cases as the average from the last five days - are we sure this is a good model, if at all? Maybe we should use a proportion of growth? What if we just randomly pick a number instead? All of these are models.
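As a rough illustration of how the choice of model alone changes the answer, the sketch below applies two of the simple models mentioned above to the same history. The cumulative totals are made up for the example and are not Malta's figures; the function names are likewise hypothetical.

```python
def daily_new(cumulative):
    """Convert cumulative totals into daily new cases."""
    return [b - a for a, b in zip(cumulative, cumulative[1:])]

def predict_by_average(cumulative, window=5):
    """Model 1: tomorrow's new cases equal the mean of the last `window` days."""
    recent = daily_new(cumulative)[-window:]
    return sum(recent) / len(recent)

def predict_by_growth(cumulative, window=5):
    """Model 2: tomorrow's new cases equal today's total multiplied by
    the mean daily growth rate over the last `window` days."""
    rates = [(b - a) / a for a, b in zip(cumulative, cumulative[1:])][-window:]
    return cumulative[-1] * sum(rates) / len(rates)

# Hypothetical cumulative case counts, for illustration only
history = [100, 110, 125, 135, 150, 170]
print("Average model:", predict_by_average(history))
print("Growth model: ", round(predict_by_growth(history), 1))
```

The two functions see exactly the same data and still disagree, and nothing in the data itself says which one is right. That gap is model error.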
The majority of models being applied assume a peak at some point; however, a second wave is possible. A logistic function would have fitted Singapore's number of cases in the first 20 days very well, probably projecting a total of fewer than 500 cases, yet there are now over a thousand confirmed cases.
Many of us experts tend to get overexcited about fitting a model without explaining (or understanding) its limitations, and it is very possible to fit the wrong model. Anyone interested in a deeper discussion of this should look for discussions on Ersatz Models.
Calibration / Parameter Error
Even if we chose the correct model, many models require some sort of calibration, that is, a parameter. For example, if we assume a percentage growth, we need to calculate that percentage. Should we use all days, or the percentage growth over the past few days only? Or maybe we should benchmark this percentage against experience in another country? In any case, we should test how our final predictions may fluctuate should we change these parameters, that is, how sensitive the model output is to these small calibrations.
My earlier predictions all gave a best-estimate range, as I was not certain of the parameters used within the models. For example, on 19th March I predicted 138 COVID-19 cases by 23rd March, but with a best-estimate range of 93 to 208 cases. The best-estimate ranges produced tended to be wider at the pessimistic end (208 is 70 more cases than 138, whereas 93 is only 45 fewer). This is due to the skewness of the expected parameters and partially because down-side risk is higher than up-side risk (in simple words, things are more likely to go worse than better). The actual number of cumulative cases on the day was 107.
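A simple sensitivity test can be sketched as follows. It takes the 25.6% growth rate mentioned earlier, treats it as a daily rate, and moves it up and down by the same amount; the starting count of 100 cases, the four-day horizon and the size of the shift are hypothetical. It is not the method behind the 93 to 208 range, only an illustration of one way a skewed range can arise.

```python
def project(cases_today, daily_growth, days):
    """Project cumulative cases forward assuming constant compound growth."""
    return cases_today * (1.0 + daily_growth) ** days

start, days = 100, 4           # hypothetical starting point and horizon
central, shift = 0.256, 0.10   # 25.6% from the text; +/- 10 points is assumed

low = project(start, central - shift, days)
best = project(start, central, days)
high = project(start, central + shift, days)

print(f"Low {low:.0f}, best estimate {best:.0f}, high {high:.0f}")
print(f"Upside gap {high - best:.0f} vs downside gap {best - low:.0f}")
```

Because growth compounds, an equal shift in the parameter in either direction gives a wider gap on the pessimistic side than on the optimistic side.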
Conclusion
Does that mean that we cannot use models to predict COVID-19 cases in Malta? No, it does not. However, it does mean that we need to appreciate their limitations and how they help us to explain different projections.
The role of the actuary, or any expert, is not simply to fit a model but to explain the deviation of real life from that model and how to deal with it.