Do Numbers Lie? The Curious Case of Coronavirus
Philadelphia Testing Site Image by The Hill

In this article I share my thoughts on Covid-19 (I’m not a health expert and definitely don’t want to pretend to be one). Rather than making health claims, it takes a closer look at Covid-19’s metrics from my point of view to explain why “numbers, when not properly transformed into metrics, can lie”. Stay with me for the next couple of minutes and I’ll elaborate:

1: Choosing the right metrics to follow

At the time of writing, California has 53,606 Covid-19 cases and Pennsylvania has 50,915. That is a stat that makes me go “wow”: does this mean Pennsylvania can reopen faster than California? Is it that simple? (A lower case count is now viewed by some as an optimistic signal for reopening states.)

Now, throwing in a denominator, the number of tests administered, gives the percent-positive rate: CA 8.18% and PA 21.6%. Does this mean California is safer? That information alone is inconclusive, and the answer is probably more complicated to arrive at: you would have to know the robustness of each testing program (the method used by The COVID Tracking Project) or look at the case growth rate in each state, which tells you the doubling period (the method used by 91-DIVOC). Some of these methods might get you the answer you are looking for, but the raw counts of 53K and 50K cases will not. The point being: choosing the right metric for the question you asked makes a world of difference.
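To make the denominator point concrete, here is a minimal Python sketch using only the four figures quoted above. The test counts it prints are not reported numbers; they are backed out from the quoted case counts and percentages, and the function names are my own for illustration.

```python
# Figures quoted in the article (cases and percent positive at time of writing).
reported = {
    "CA": {"cases": 53_606, "percent_positive": 8.18},
    "PA": {"cases": 50_915, "percent_positive": 21.6},
}


def percent_positive(cases: float, tests: float) -> float:
    """Share of administered tests that came back positive, in percent."""
    if tests <= 0:
        raise ValueError("tests must be positive")
    return 100.0 * cases / tests


def implied_tests(cases: float, pct_positive: float) -> float:
    """Test count implied by a case count and a percent-positive figure."""
    if pct_positive <= 0:
        raise ValueError("pct_positive must be positive")
    return cases * 100.0 / pct_positive


for state, row in reported.items():
    # Derived value, not a reported one: tests = cases / (percent positive).
    tests = implied_tests(row["cases"], row["percent_positive"])
    rate = percent_positive(row["cases"], tests)
    print(f"{state}: {row['cases']:,} cases, ~{tests:,.0f} implied tests, {rate:.2f}% positive")
```

Nearly identical case counts sit on top of very different amounts of testing (roughly 655,000 implied tests in CA versus roughly 236,000 in PA), which is why the raw counts alone cannot answer the reopening question.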

Fig1.1 Infection growth rate in CA leveling off (Y-axis: # of cases on a log scale, X-axis: days since 100th case)
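The doubling period mentioned in this section can be estimated from two cumulative case counts. The sketch below assumes steady exponential growth between the two observations, which is a simplification, and the counts are hypothetical, not real state data.

```python
import math


def doubling_time(cases_start: float, cases_end: float, days: float) -> float:
    """Days for cases to double, assuming steady exponential growth over the window."""
    if cases_start <= 0 or cases_end <= cases_start or days <= 0:
        raise ValueError("need positive counts, growth, and a positive window")
    return days * math.log(2) / math.log(cases_end / cases_start)


# Hypothetical cumulative counts, seven days apart (illustrative only).
early = doubling_time(1_000, 4_000, days=7)    # cases quadruple in a week
later = doubling_time(40_000, 50_000, days=7)  # growth leveling off

print(f"early outbreak: doubling every {early:.1f} days")
print(f"later outbreak: doubling every {later:.1f} days")
```

A longer doubling time is what a leveling-off curve on a log-scale chart looks like in number form, even while the total case count is still rising.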

2: Leading & Lagging Indicator

When we all first learnt about the two major metrics out there, i.e., deaths and number of cases, I didn’t know what to make of each of them. Let me explain:

If deaths are really low but cases are on the rise, should I conclude I’m probably fine and not at risk? If deaths are picking up while cases are dropping, should we impose stricter measures? While a death is definitely a more emotional outcome than a new infection, the death count gives lessons for tomorrow but doesn't tell you where you are today. It is a lagging indicator. Here, the number of new cases is the leading indicator.

Identifying which indicators lead and which lag helps us make effective decisions, e.g., How fatal is this disease? (insight from the lagging indicator) How well are my social distancing measures working? (insight from the leading indicator). Identifying and using these indicators correctly is the difference between life and death (quite literally).
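One way to see the leading/lagging relationship is to check how many days deaths trail new cases. The sketch below uses a fully synthetic outbreak in which I assume 2% of cases die exactly 14 days later; both numbers are invented for illustration, and real data would be far noisier.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def best_lag(cases, deaths, max_lag):
    """Lag in days at which deaths track earlier cases most closely."""
    scores = {}
    for lag in range(max_lag + 1):
        # Pair cases on day t with deaths on day t + lag.
        scores[lag] = pearson(cases[: len(cases) - lag], deaths[lag:])
    return max(scores, key=scores.get)


# Synthetic daily new cases: rise, peak on day 30, then fall.
cases = [1000 * math.exp(-((day - 30) ** 2) / 200) for day in range(90)]

# Assumptions for illustration only: 2% fatality, deaths exactly 14 days after cases.
LAG_DAYS, FATALITY = 14, 0.02
deaths = [0.0] * LAG_DAYS + [FATALITY * c for c in cases[:-LAG_DAYS]]

print("deaths lag cases by", best_lag(cases, deaths, max_lag=30), "days")
```

The death curve has the same shape as the case curve but arrives later, so acting on deaths alone means acting on information that is already weeks old.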

Fig2.1 Flatten the curve graph shows # of cases plotted against time. (#cases is an actionable leading indicator)

3: Data Integrity

The mother of all is data integrity. While it is third on my list, it should be the first priority for any metric or data collection process. What is data integrity? Simply put, data integrity is capturing all data with accuracy, consistency and completeness, be it identification by testing, by symptoms or by any other tool out there. With Covid-19, this caused a lot of frustration: we were not able to capture the number of cases in its entirety (accuracy, completeness), and definitions kept changing (consistency), which resulted in fatality rates all over the map (1% to 11%).

And it is not just the fatality rate: every metric based on the number of cases is heavily skewed when data integrity is poor. This makes decision making skewed or flawed, which can result in loss of lives. On the flip side, some countries like Taiwan did emerge victorious, largely due to high data integrity paired with early decision making.
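To illustrate how incomplete case capture alone can spread fatality rates across a wide range, here is a small Python sketch. The outbreak numbers are hypothetical, and it assumes deaths are fully counted while only a fraction of infections are confirmed, which may not hold in practice.

```python
def apparent_fatality_rate(deaths: float, infections: float, detected_fraction: float) -> float:
    """Deaths divided by confirmed cases, in percent."""
    if not 0 < detected_fraction <= 1:
        raise ValueError("detected_fraction must be in (0, 1]")
    confirmed = infections * detected_fraction
    return 100.0 * deaths / confirmed


# Hypothetical outbreak (not real data): 1,000 deaths among 100,000 infections.
DEATHS, INFECTIONS = 1_000, 100_000

for detected in (1.0, 0.5, 0.25, 0.1):
    rate = apparent_fatality_rate(DEATHS, INFECTIONS, detected)
    print(f"{detected:>4.0%} of infections confirmed -> apparent fatality rate {rate:.1f}%")
```

The same outbreak reads as 1% fatal when every infection is confirmed and 10% fatal when only one in ten is, without anything about the disease itself changing.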

4: Visualizations

Choosing the right visualization is also important (for insights and decision making), and this is one we kinda got right (as it was totally controlled by experts, phew!). I won’t dig into it, as this Vox video explains visualizations perfectly, including why infectious diseases warrant a log scale instead of a linear scale (and the drawbacks of choosing one over the other): VOX

Fig4.1 Vox explains different messages conveyed by the same data using different graphs 
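A quick numerical way to see why a log scale suits an outbreak: with steady doubling, the step between observations keeps growing on a linear axis but stays constant on a log axis. The series below is hypothetical (100 cases, doubling every 3 days) and is only meant as a sketch.

```python
import math

# Hypothetical outbreak: 100 cases on day 0, doubling every 3 days (illustrative only).
days = list(range(0, 31, 3))
cases = [100 * 2 ** (day // 3) for day in days]

# Step between consecutive observations on a linear axis vs. a log10 axis.
linear_steps = [b - a for a, b in zip(cases, cases[1:])]
log_steps = [math.log10(b) - math.log10(a) for a, b in zip(cases, cases[1:])]

print("linear steps:", linear_steps)
print("log10 steps: ", [round(step, 3) for step in log_steps])
```

On the log axis a steady doubling rate is a straight line, so any bend toward flat shows the growth rate slowing. The trade-off is that a log axis compresses large absolute differences, so the same chart can make a big jump in cases look modest.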

While Covid-19 is a very unfortunate reality we are living through, it has nevertheless proved to be an interesting case study in the importance of metrics and why we need to get them right. While the axiom “Numbers don’t lie” is still largely accepted, measuring these numbers and transforming them into metrics is the make-or-break step, and metrics do lie when enough caution is not applied!


What are your thoughts on this article? Have you found gaps in your organization's metrics similar to those in Covid-19's metrics? What best practices do you adopt as an engineer/leader to ensure your metrics are actionable and lead to the outcomes you want? Let me know in the comments or DM me.




References:

1- https://www.theverge.com/2020/4/2/21201832/novel-coronavirus-covid-19-best-graphs-tracking-data

2- https://www.npr.org/sections/health-shots/2020/03/16/816707182/map-tracking-the-spread-of-the-coronavirus-in-the-u-s#states

3- https://www.npr.org/sections/coronavirus-live-updates/2020/04/14/834431383/taiwan-reports-no-new-coronavirus-cases-adding-to-success-in-fighting-pandemic

4- https://www.bbc.com/future/article/20200401-coronavirus-why-death-and-mortality-rates-differ

5- https://www.vox.com/videos/2020/4/28/21238769/coronavirus-covid19-chart-data-misleading

6- https://www.nytimes.com/article/flatten-curve-coronavirus.html

7- https://www.inquirer.com/health/coronavirus/coronavirus-testing-sites-philadelphia-main-line-south-jersey-insurance-20200324.html







       
