Making sense of the biggest lockdown of the world with numbers
Kunal Mehta
Global Data Platform Head | Product Owner | Associate Director | Data Analytics | Google Analytics | Adobe Analytics | Google Cloud Platform | Machine Learning | Data Science | Data Engineering | Speaker
The purpose of this article is threefold: 1) to understand the impact of COVID-19 in India, in terms of the number of cases and fatalities, through data; 2) to see whether there is any mathematical pattern in the numbers, and whether that pattern can be extrapolated into the future; and 3) to measure the impact of the lockdown, and to estimate what could have happened had there been no lockdown in India.
The world has changed, and it has changed for a long time. An extremely tiny, though far from insignificant, something, a virus that hovers on the boundary between living and non-living, has pretty much brought the mighty world to its knees. For close to 60 days the whole of India has been in lockdown, and so has pretty much every major country in the world. Lockdown means that everything, OK almost everything, is closed. This means that there has been a substantial loss to the economy, to GDP, to production, to almost every measurable metric out there, including life. As a result, we are now hearing a clamor from different sections of society to open up the world and the economy.
In fact, some even question the lockdown itself, asking whether the whole exercise has actually achieved anything. Is it the right time to open up the world, or maybe to open it up partially?
In this article, I am not trying to take sides. I am just trying to show what the lockdown in a country like India has achieved in terms of the number of cases and casualties, whether it has had any positive impact, and what exactly we can expect once we open up the economy and allow the flow of people, goods and transportation.
So, let’s get some facts right at the beginning:
India’s COVID-19 cases: 91,314*
Deaths: 2,897*
Fatality Rate: 3.17%* (deaths divided by confirmed cases; it has been close to 3% for a while now)
Recovered cases: 34,581*
Fig 1: India's COVID cases in a linear chart
Fig 2: India's COVID cases in a logarithmic chart
Why is the log scale important?
A logarithmic scale gives a better perspective on two things:
- If there are really small and really big numbers on the same graph, it’s really difficult to see the smaller ones. For example, in the current scenario, the initial numbers (1, 2, 10, 30, etc.) can barely be seen now that the current numbers are close to 100,000. So on the linear graph (fig 1) it looks as if the numbers jumped suddenly, whereas this was not the case: as we can clearly see from the logarithmic graph (fig 2), the numbers grew fastest in the initial stages, and the growth rate now seems to have tapered off.
- Whenever we want to observe the rate of change, logarithmic charts again come in really handy. Even when the rate of change is constant, a linear chart can create real panic, since it shows the compounding effect as an ever-steeper curve, whereas a logarithmic chart shows constant growth as a straight line. In the initial days, when India imposed the lockdown, the doubling time of India’s COVID cases was as short as 3 days. Right now, that is on May 17, 2020, it stands at close to 17 days. This massive improvement can easily be lost if you look only at figure 1 and the numbers on the linear scale; figure 2, the logarithmic chart, shows the curve tapering off, which reflects how the doubling time is lengthening.
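To make the doubling-time idea concrete, here is a minimal sketch of the arithmetic. It assumes constant exponential growth between two observations, and the case counts are hypothetical, chosen only for illustration; they are not the MoHFW figures. The function name doubling_time is my own.

```python
import math


def doubling_time(earlier_count, later_count, days_between):
    """Days needed for cases to double, assuming constant exponential
    growth between the two observations."""
    daily_growth_rate = math.log(later_count / earlier_count) / days_between
    return math.log(2) / daily_growth_rate


# Hypothetical cumulative counts (illustration only, not real data).
fast = doubling_time(100, 200, 3)      # cases doubled in 3 days
slow = doubling_time(1000, 2000, 17)   # cases doubled in 17 days

# On a log scale, equal ratios become equal vertical steps, so a
# doubling from 10 to 20 looks the same as one from 10,000 to 20,000.
step_small = math.log10(20) - math.log10(10)
step_large = math.log10(20000) - math.log10(10000)

print(f"fast doubling time: {fast:.1f} days")
print(f"slow doubling time: {slow:.1f} days")
print(f"log-scale step for a doubling: {step_small:.3f} vs {step_large:.3f}")
```

This is why a series with a constant doubling time appears as a straight line on a logarithmic chart, and why a lengthening doubling time shows up as the line bending towards the horizontal.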
Now that we have seen what it looks like right now, let’s see what we are actually aiming for.
Fig 3: Total cases in South Korea, in logarithmic chart
This is South Korea’s chart, which clearly shows a perfectly flattened curve on the logarithmic scale, but let’s look at its linear scale as well:
Fig 4: Total cases in South Korea, in linear chart
As this chart shows, we can expect the linear graph to flatten as well, but the logarithmic graph will flatten well before the linear one does. And once both of these graphs flatten out, this is what happens with the active cases:
Fig 5: Active cases in South Korea, linear chart
As we can see from figure 1 and figure 2, we are quite a distance from the ideal case that we saw in South Korea, so the question you might ask is whether the lockdown actually prevented anything. Did we really achieve anything?
To understand what we achieved with the lockdown, we have to understand what would have happened without it. To understand the impact, I did a couple of things:
- I took all the date-wise data since January 30 from the MoHFW (Ministry of Health and Family Welfare) website and tried to build a statistical model to see whether mathematics can come close to unraveling the pattern, and whether we can predict what to expect next.
- I also built a model on the data up to March 23, 2020, to see what the pattern looked like, whether we can come up with a close enough mathematical equation, and whether we can use it to estimate how many cases there would have been had there been no lockdown.
To address part 1, I first created a model by applying polynomial regression to the time series, which gave an equation with an R² of 0.9993. Yep, I know that it’s unreal, but that’s what it is. To validate it further, I tested it on the last 3 days and came up with the numbers below.
India present COVID Numbers*
What this means is that the equation I came up with was pretty solid: not only was the R² value pretty unreal, but the accuracy was very high as well.
Once I knew that mathematics can help us ascertain the facts with such amazing accuracy, I addressed part 2 of the question, again by applying polynomial regression to the time series data, and came up with an R² of 0.9962. Once the numbers were calculated for the 21st, 22nd, and 23rd of March, I came up with the following numbers:
This means that this model was just as accurate, and reliable enough for us to extrapolate and see where we could have been today had there been no lockdown. And these are the numbers that we saw:
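For readers who want to try the same kind of check, here is a minimal sketch of polynomial regression on a time series with the last 3 days held out. It is not my actual model: the data below is synthetic (a smooth curve plus small noise), the polynomial degree of 3 is an assumption, and all variable names are illustrative.

```python
import numpy as np

# Synthetic cumulative case counts (NOT the MoHFW data): smooth growth
# with about 1% multiplicative noise, for illustration only.
rng = np.random.default_rng(0)
days = np.arange(60)
cases = 5 + 0.5 * days**2 + 0.02 * days**3
cases = cases * (1 + rng.normal(0, 0.01, size=days.size))

# Hold out the last 3 days for validation.
train_days, test_days = days[:-3], days[-3:]
train_cases, test_cases = cases[:-3], cases[-3:]

# Fit a polynomial of an assumed degree to the training period.
degree = 3
model = np.poly1d(np.polyfit(train_days, train_cases, degree))

# R squared on the training period: 1 - SS_res / SS_tot.
fitted = model(train_days)
ss_res = np.sum((train_cases - fitted) ** 2)
ss_tot = np.sum((train_cases - train_cases.mean()) ** 2)
r_squared = 1 - ss_res / ss_tot

# Accuracy on the held-out days, as absolute percentage error.
predicted = model(test_days)
pct_error = np.abs(predicted - test_cases) / test_cases * 100

print(f"R squared on training data: {r_squared:.4f}")
for day, err in zip(test_days, pct_error):
    print(f"day {day}: {err:.2f}% error")
```

Cumulative counts only ever go up and tend to be smooth, so a very high R² is common for this kind of series. That is why the held-out days are the more informative check.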
This clearly shows how impactful the lockdown has been. The number of cases, which would have been hovering around 160k in India, is close to 91k. That is about 43% lower than it could have been; put the other way, the no-lockdown figure would have been about 75% higher than what we actually have. Now these are infections, but how about casualties? As I mentioned earlier, India has been lucky in keeping the fatality rate around 3% for some time now, which means that the number of deaths, which is at 2,897, would have been around 4,800. That is about 40% lower than it could have been, or, put the other way, the no-lockdown figure would have been about 65% higher.
But lives can’t be expressed in mere numbers and percentages. It could have meant 1,900 more deaths, 1,900 more families affected, and probably 10,000 lives affected in a way that nothing would have brought back to ‘normal’.
If there’s still someone who wants to measure the impact of lives in terms of hard numbers, that is money: India’s GDP per capita is currently at $2,144, and for the roughly 2,000 lives that were saved this comes out to $4,287,892, which at the current currency conversion rate is INR 32,53,52,381. This covers only the lives saved; now add the number of cases, because even those would impact GDP per capita, and you might be able to understand the impact that the lockdown has had in the last 60 days.
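The comparison above can be reproduced with a few lines of arithmetic. This sketch uses the rounded no-lockdown estimate of 160,000 cases and an assumed flat 3% fatality rate, so treat the outputs as approximate. It also shows that the percentage depends on the baseline: the 75% and 65% figures come out when the difference is measured against the actual numbers, while measuring against the no-lockdown projection gives roughly 43% and 40%.

```python
# Actual figures for May 17, 2020, as quoted in this article.
actual_cases = 91_314
actual_deaths = 2_897

# Rounded no-lockdown estimate and an assumed flat fatality rate.
projected_cases = 160_000
assumed_fatality_rate = 0.03

projected_deaths = projected_cases * assumed_fatality_rate
cases_averted = projected_cases - actual_cases
deaths_averted = projected_deaths - actual_deaths

# Same difference, two baselines.
cases_lower_vs_projection = cases_averted / projected_cases * 100
cases_higher_vs_actual = cases_averted / actual_cases * 100
deaths_lower_vs_projection = deaths_averted / projected_deaths * 100
deaths_higher_vs_actual = deaths_averted / actual_deaths * 100

print(f"projected deaths: {projected_deaths:.0f}")
print(f"deaths averted: {deaths_averted:.0f}")
print(f"cases: {cases_lower_vs_projection:.1f}% lower than projected, "
      f"projection {cases_higher_vs_actual:.1f}% higher than actual")
print(f"deaths: {deaths_lower_vs_projection:.1f}% lower than projected, "
      f"projection {deaths_higher_vs_actual:.1f}% higher than actual")
```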
What can refine it further?
I understand that all these cases aren’t really a monolith; there are various clusters out there, especially in different states. To understand the data better and quantify the impact further, we can start modeling at the state level and at the level of individual clusters. For these clusters, it may be better to use SVR (support vector regression) or decision tree algorithms rather than polynomial regression.
How can you use it in your business?
I would highly recommend running a similar analysis on your business numbers, that is number of orders, revenue, etc., to understand how the lockdown and its lifting will impact your business metrics. What could the demand for your products be across your various platforms and channels, so that you can prepare your supply chain? Also, by going a little more granular, to the geographic level, you can better understand the impact on every store, every product, and every category.
Data is power, and in these times, to survive, and maybe even come out stronger, you will need to harness the power of data to prepare yourself better for the days ahead. I wish you all godspeed in the journey that lies ahead of us.
*Data for May 17, 2020
Sources:
1) https://www.worldometers.info/coronavirus/#countries
2) MoHFW (Ministry of Health and Family Welfare) website