Is COVID-19 Deadlier Than We Thought?

Pandemics have been the plot of a fair share of science fiction, but living through COVID-19 has proven the old adage true: truth really is stranger than fiction. From the toilet paper hoarding to the starkly contrasting beliefs about the virus, many are unsure whether this is all an overblown flu or the end of the world as we know it. And the lack of solid data, coupled with a barrage of misinformation, makes it difficult to know what to believe.

So, here's the deal...this article is going to look at data and leave the speculation elsewhere. If you're looking for an answer to whether China's numbers are accurate, you won't find it here. And, if you're looking for any grand claims or outlandish forecasts, you're also in the wrong place.

But, if you want a real look at how this novel coronavirus has progressed based on the most accurate, available data, you're in the right place.

About the Data

The data used here comes from the European Centre for Disease Prevention and Control. It was made available courtesy of Our World in Data, where anyone can download the latest .csv file. This .csv includes daily counts of cases and deaths alongside running totals for both. If you're interested in studying the data more closely, I encourage you to download the latest COVID-19 source data.

There are no modifications made to this data.

Instead, I look at various countries, time periods, and metrics, visualizing this information using PyCharm, a Python IDE (integrated development environment) for data science. In simple terms, I make the data easy to understand.

Starting with Basics

While I have been tracking COVID-19 since early February, I only started a more detailed analysis of this data in PyCharm on March 15, 2020. In that time, I have tracked its progress across the hardest-hit countries, focusing primarily on the overall number of cases and the case fatality rate. And, I have data available going back to the beginning.

For the purpose of this article, I am going to focus 100% of my attention on the global data; we can visit individual countries on their own in future articles. And, as I mentioned at the start, I will NOT be putting a spin on the numbers or predicting a final outcome. I'm not a psychic, I'm a scientist.

With that said, let's review what I'm most interested in: total cases, and case fatality rate. Total cases are pretty clear-cut. How many confirmed cases are being recorded in the data? Since we're using data from the European Centre for Disease Prevention and Control, we have what I believe to be the most accurate data available. But, what about total deaths?

Tracking the Death Toll of COVID-19

Total deaths is a difficult number. Again, I am using the most accurate data available, but looking at deaths as a raw number doesn't give a complete picture. Instead, I use the "case fatality rate" as my preferred "death measurement." No, this doesn't change the number. Instead, it looks at a ratio. Specifically, it asks what percentage of the confirmed cases have ended in death.
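As a minimal sketch of that ratio (not the code behind this article's chart), the CFR is simply total deaths divided by total confirmed cases, expressed as a percentage. The function name `case_fatality_rate` and the toy inputs below are assumptions chosen for illustration, not figures from the data set.

```python
def case_fatality_rate(total_deaths: int, total_cases: int) -> float:
    """Return the case fatality rate as a percentage of confirmed cases."""
    if total_cases <= 0:
        # With no confirmed cases there is nothing to divide by.
        raise ValueError("CFR is undefined without confirmed cases")
    return 100.0 * total_deaths / total_cases


# Toy numbers for illustration only (hypothetical, not real figures):
# 5 deaths among 100 confirmed cases.
print(f"{case_fatality_rate(5, 100):.2f}%")
```

Because both inputs are running totals, the result moves every time either total is updated.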

Please note, and this is important, the case fatality rate (CFR) is a fluid number. It's not absolute truth. In fact, science never gives absolute truth. That would contradict the entire scientific method. Instead, the CFR looks at AVAILABLE data.

Yes, I am going to break a rule of "good writing" and repeat that again: we can only look at available data.

In other words, the CFR is the "best information we have" but it's undoubtedly not going to represent the entire population. Remember, the data we have is a sample. So, don't fall into the easy trap of assuming it's on a set path. This number can and will change as more cases are discovered.

With that disclaimer, let's look at the numbers.

How Deadly is the Novel Coronavirus?

Since my goal is not to spread panic, I am including one final warning: this data looks scary. But, it doesn't have to be. Remember, it's not final.

[Figure: global CFR over time]

So, let's dissect this chart a bit.

I started on January 30th, 2020, primarily because the CFR was somewhat stabilizing at this point in time. However, I don't want to mislead anyone. The reason I don't go further back is simple: our data only begins January 1st, and up until the 13th there are either 0 reported deaths or a flat CFR. After that, the rate went as high as 3%, but with so few cases it was rising and dropping too quickly; it hadn't hit its stride.

Now, back on this chart. It pretty much speaks for itself, but let me fill in some details.

The first date, 1/30/20, has a 2.2% CFR. And, it fluctuates for some time around this 2% level. Specifically, the continuous upward trend doesn't begin until 2/14/20. From this date, the rate rises for two straight weeks without a break.

March brings some slowdown on this front, at least for the first 3 weeks. Over that time, however, it still increases from 3.4% to 4%. And, from 3/23/20 through the time of posting on 4/2/20, the rate has continued to climb to today's 5%. Or, if you want to be a bit more specific, 5.05%.

Today, 4/2/20, also marks the largest increase in new cases. To keep this in perspective, here's the TOTAL number of confirmed cases over the last week. I'll do the math for you: 

2020-03-26    468049 - An increase of 51204 over the previous day. CFR: 4.48%

2020-03-27    527767 - An increase of 59718 over the previous day. CFR: 4.48%

2020-03-28    591704 - An increase of 63937 over the previous day. CFR: 4.56%

2020-03-29    656866 - An increase of 65162 over the previous day. CFR: 4.64%

2020-03-30    715353 - An increase of 58487 over the previous day. CFR: 4.69%

2020-03-31    777797 - An increase of 62444 over the previous day. CFR: 4.79%

2020-04-01    851309 - An increase of 73512 over the previous day. CFR: 4.92%

2020-04-02    928437 - An increase of 77128 over the previous day. CFR: 5.05%
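For anyone who wants to check that math, here is a small sketch that recomputes the day-over-day increases from the cumulative totals listed above. The totals are copied from the list; the helper name `daily_increases` is an assumption for illustration and is not taken from the repository.

```python
from datetime import date

# Cumulative confirmed cases, copied from the list above.
total_cases = {
    date(2020, 3, 26): 468049,
    date(2020, 3, 27): 527767,
    date(2020, 3, 28): 591704,
    date(2020, 3, 29): 656866,
    date(2020, 3, 30): 715353,
    date(2020, 3, 31): 777797,
    date(2020, 4, 1): 851309,
    date(2020, 4, 2): 928437,
}


def daily_increases(totals):
    """Map each date (after the first) to the rise in total cases since the prior date."""
    days = sorted(totals)
    return {
        today: totals[today] - totals[yesterday]
        for yesterday, today in zip(days, days[1:])
    }


increases = daily_increases(total_cases)
for day, rise in increases.items():
    print(day.isoformat(), rise)

# The date with the largest single-day rise in this window.
largest_day = max(increases, key=increases.get)
print("Largest increase:", largest_day.isoformat(), increases[largest_day])
```

The first date has no computed increase here because the previous day's total is not part of the list.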

Sticking to the available data, it's essential you remember a couple of things. First, CFR is based on the total number of deaths vs. the total number of confirmed cases. So, this increase in percentage is not just some blip on the radar. In other words, if we look at 4/2/20, we are NOT just saying there's a 5.05% CFR among the 77,128 new confirmed cases.

Instead, we are looking at a 5.05% OVERALL CFR among all 928,437 reported cases. Put another way, if that 5% were to flatten out, we'd be looking at 50,000 deaths per 1,000,000 confirmed cases.
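The arithmetic behind that scenario can be sketched in a few lines. The helper `deaths_at_cfr` is a hypothetical name, and the second figure it prints is only an approximation of the reported death total, because the 5.05% it starts from is already rounded.

```python
def deaths_at_cfr(confirmed_cases: int, cfr_percent: float) -> float:
    """Deaths implied by a CFR (given in percent) across a number of confirmed cases."""
    return confirmed_cases * cfr_percent / 100.0


# The flat 5% scenario: deaths per 1,000,000 confirmed cases.
print(deaths_at_cfr(1_000_000, 5.0))

# Deaths implied by the rounded 5.05% CFR on 928,437 cases (approximate).
print(round(deaths_at_cfr(928_437, 5.05)))
```

This is a what-if calculation at a fixed rate, not a forecast.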

HOLD ON THOUGH.

I am not saying that will happen. I am still hopeful the CFR will begin to wane as this progresses in April. But, let's not kid ourselves either. When making decisions, we can't go solely off what "might be." We need to go off data.

Is the Coronavirus Deadlier than Originally Thought?

The data says...maybe.

As I've been stressing throughout this whole ordeal (I don't just mean this article, but rather the last couple months overall), we only have so much data.

For scientists, data is everything. And frankly, I'm appalled at how information is being presented through the media.

You're a smart person who can think for yourself -don't let anyone tell you otherwise. But, instead of presenting information as I have here, most of these "experts" are out making wild claims one way or the other. Stop listening to them.

Start looking at the facts of the matter. The only available facts: data like I am showing you here.

Remember, I'm NOT just pulling numbers out of a hat or making wild predictions. I use the data from the European Centre for Disease Prevention and Control on confirmed cases and confirmed deaths from COVID-19.

I use it because, out of available data, it's what I believe is best. And, that might mean nothing to you, which is fine. I'm not alone in believing this data is best, but if you trust some other legitimate data source more, look towards that data (please, don't just rely on these "real-time" trackers though).

If you do believe the source data is as good as it gets (like I do), then let's be real about these numbers.

The death rate could end up being higher than we originally anticipated. Yes, this could still end up having an overall fatality rate closer to 1% when this is all said and done and we properly record all the presently unreported cases. Or, the CFR could continue rising; no one really knows.

What do we know?

Human lives are at stake.

I don't know about you, but when those are the stakes, I'll always err on the side of caution.

Yes, the current 5.05% is not going to be representative of the whole population. But I'll be damned if I don't make decisions as if it is.

Why? Because that's what our current data is showing.

Not only that, but as we get a larger sample size, we've seen that number rising. Statistics 101, guys...a larger sample size means more accurate metrics. For example, you can flip 10 coins and get 10 heads. But, as you flip 1000, 10000, 100000 or more, you're more likely to land close to that 50/50 ratio.

The same holds true for any data -the more you collect, the more accurate it becomes.
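The coin-flip point is easy to try out. The sketch below simulates independent flips of a fair coin and prints the share of heads at each sample size; the function name `heads_share` and the seed value are arbitrary choices for illustration.

```python
import random


def heads_share(flips: int, rng: random.Random) -> float:
    """Flip a fair coin `flips` times and return the share that came up heads."""
    heads = sum(rng.random() < 0.5 for _ in range(flips))
    return heads / flips


rng = random.Random(2020)  # fixed seed so reruns produce the same flips
for flips in (10, 1_000, 100_000):
    print(f"{flips:>7} flips: {heads_share(flips, rng):.3f} heads")
```

Small runs can land well away from 0.5, while the largest run should sit very close to it. The simulation assumes every flip is independent and fair.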

What about Missing Data?

Here's where things get wonky.

We don't really know how much data is missing.

Some believe China has lied about its numbers from the beginning. Maybe they have.

And, some point out there is massive under-testing in many areas. Maybe there is.

Again though, remember that rule about data: the more we confirm, the more we can trust it. So, missing data matters, but what are we supposed to do about it? For now, we can't even say for sure how much data is missing. We can theorize and predict all we want, but since we're playing with an incomplete data set, the end result could look much different than we think.

Since this isn't meant as a lesson on data science, here's my briefest possible explanation on this subject: missing data is bad for any analysis, but since we can't account for the gaps or accurately fill them at present, we have to use what we have available.

So, What Am I Recommending?

OK, I said I wasn't going to put my own spin on it. And, let's be clear -this is a recommendation, not a prediction.

Limit your exposure to others.

Have loved ones with risk factors? Make sure they are taken care of and make sure they stay inside. In fact, you should downright quarantine yourself if you're in a "risky group." Not sure if this includes you? Check out the CDC COVID-19 Risk FAQ.

There's still so much disagreement on what is best. Even the so-called "experts" don't seem to have a solid consensus. But, pretty much all of them agree precautions are necessary.

So, here's my simple recommendation. Don't expose yourself to anyone you don't have to. Don't shop if you don't have to. Don't go anywhere you don't have to. And, when you think you might be in contact with any contagion, wash your damn hands!

This is a radical proposal. And, I wish the government didn't have to get involved in it -I wish we, as individuals, would take these precautions. If we were careful and avoided exposing those most vulnerable, I do believe there could be a reality where the CFR is much lower. But, let's stay in this reality.

Easy for a Hermit to Say

Look, I get it. What I'm suggesting is easy for me -I'm already a hermit.

I live in my home with my wife and two cats. I work from home. I rarely go out for dinner or to the movies. Even during normal times, we might only leave the house once or twice in a week. And, for introverts like us, it's easy to live that way.

However, I'm neither well-informed enough nor interested enough to explore the plethora of problems this recommendation presents. I don't just mean for the economy or for individuals whose jobs are gone; I mean for everyone, psychologically (actually, I might meander into the economics of it in the future).

But, I imagine we'll have many smart people exploring these factors over the next decade at least. I'll be happy to weigh in on data or offer my own understanding where I can, but that's not for us here.

Instead, I hope the data presented above gives you a picture of WHY it's important we do something.

Moving Forward Together

Look, I want to stick to my initial promise to keep this focused on the data, so I won't make any specific predictions. But, I do want to say one final thing as I finish writing this.

I believe the world will be forever changed by this pandemic. And, I don't know if it will be for the better. However, I do believe that we can focus on potential positive outcomes rather than dwell on what might be lost.

The future always brings opportunities if we know where to look. And while nothing can make up for the loss of a loved one, I implore everyone to take a moment and really consider the fragility of what we have here.

Let's find a way forward, no matter what might come our way.

Raw Data/Source Code

For a look at how this was prepared, please feel free to visit my COVID-19 GitHub Repository. This is definitely not complete; I will be updating it in the future to clean up the code, perform additional calculations, and add docstrings for the functions. But, for now I am just including it in case anyone is curious.
