COVID-19 Data – KEEP IT SIMPLE
Dr Tony Burns
Q-Skills3D Interactive learning in Continual Improvement for all employees
There have been many posts attempting to draw control charts over COVID-19 data. Others try to fit polynomials to the data. I have even seen a Fourier analysis of COVID-19 data. Today, the computer has become an instrument of data torture. Rather than study the data, folk want to manipulate it.
Disease data is neither homogeneous nor independent. We know that people are infected by other people, so each day's data depends on the data from the day before. We could use a control chart as a test for homogeneity, but it would be of no help here because we already know the data is not homogeneous. See https://www.qualitydigest.com/inside/operations-column/homogeneity-charts-090616.html
“If in doubt, start with a running record and histogram. Proceed only if additional analysis is needed.” - Dr Wheeler.
Sadly, these days most quality folk feel that the running record, drawn as a run chart or bar chart, is passé. You rarely see a histogram that is not hidden behind a meaningless normal distribution drawn over it. The importance of histograms is now seldom taught.
“However, some data actually can be taken at face value, and all we need to do is to place it in context with a simple running record to understand what it is telling us. Some people are afraid of clarity because they fear that it may not seem profound.” – Dr Wheeler
We investigate the running record, the bar chart. We see that deaths (and new cases) increased rapidly and are now slowly decreasing.
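As a rough illustration of how little machinery a running record and a histogram need, here is a minimal sketch in plain Python. The function names `running_record` and `histogram` and the numbers in `values` are made up for illustration; they are not the data behind the chart discussed here.

```python
from collections import Counter


def running_record(values, scale=1):
    """Return one text bar per observation, in time order."""
    return [
        f"{i + 1:>3} {'#' * round(v / scale)} {v}"
        for i, v in enumerate(values)
    ]


def histogram(values, bin_width):
    """Count how many observations fall into each bin of width bin_width."""
    bins = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(bins.items()))


# Made-up daily counts, for illustration only.
values = [5, 12, 24, 38, 45, 41, 36, 30, 27, 22]

# Running record: the data in time order, one bar per day.
for line in running_record(values):
    print(line)

# Histogram: the same data with the time order discarded.
for lower, count in histogram(values, bin_width=10).items():
    print(f"{lower:>3}-{lower + 9:<3} {'#' * count}")
```

The running record keeps the time order, which is where the rise and the slow decline show up. The histogram throws the order away and shows only how the values are spread.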
“But what about all that noise in the data?” you say.
We examine the “noise” and discover it isn’t noise at all. There is a low data point exactly every 7 days! Are fewer people dying on weekends? Is there better care on weekends? Is there worse reporting on weekends?
We also discover that there are fewer new cases on weekends. Are there fewer sick people reporting their illness on weekends? Is there less testing on weekends?
We need to investigate, using the data.
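One simple way to start that investigation is to average the daily counts separately for each day of the week. The sketch below is only an assumption about how one might do it: the helper `weekday_means`, the start date, and the counts are all hypothetical and chosen so that the weekly dip is easy to see.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def weekday_means(start, counts):
    """Average the daily counts separately for each day of the week."""
    by_day = defaultdict(list)
    for offset, count in enumerate(counts):
        day = start + timedelta(days=offset)
        by_day[DAY_NAMES[day.weekday()]].append(count)
    return {name: mean(by_day[name]) for name in DAY_NAMES if name in by_day}


# Made-up counts for illustration only: three weeks starting on a Monday,
# with the last two days of each week deliberately lower.
start = date(2020, 4, 6)  # a Monday
counts = [50, 52, 51, 49, 50, 30, 28,
          48, 47, 49, 46, 47, 29, 27,
          44, 45, 43, 44, 42, 26, 25]

means = weekday_means(start, counts)
for name, value in means.items():
    print(f"{name:<9} {value:6.1f}")
```

If the same one or two days are consistently low, the dip is a weekly signal and not noise. The table cannot say whether the cause is care, testing, or reporting; that still has to be investigated.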
Examining COVID-19 data is a great example of how people try to complicate things and miss the message in the data. KEEP IT SIMPLE. If you are trying to use more than Professor Ishikawa’s tools, you are probably getting Quality wrong.
Q-Skills3D Interactive learning in Continual Improvement for all employees
4 years ago. CDC latest: “with deaths involving coronavirus disease 2019 (COVID-19). For 6% of the deaths, COVID-19 was the only cause mentioned.” https://www.cdc.gov/nchs/nvss/vsrr/covid_weekly/index.htm
System Engineering Management Consultant (Ret)
4 years ago. Interesting that a quick comparison (not in real time) with the similar chart of new cases by day over the same period shows the same low point on Friday and Saturday. The new cases chart shows that a peak was reached on or about 5 April, two weeks before the peak in this death chart on or about 15 April. My guess on the dips is that tests in commercial labs don't get done or reported on weekends, and tests in hospitals tend to be held until Monday, when the office crew comes back to work. Also, given the increase in testing starting in late April, I suspect we are seeing more asymptomatic cases, which brings the numbers in the latter part of the chart up over the earlier ones, when only real reported cases were recorded. As you say, simple histograms often tell you more than charts where analysts try to find complication in simple data. Go where the useful information is and where you can look at cause and effect.
Practitioner & Promoter of the Shewhart/Deming Management Method (SPC/14 Points) and the System of Profound Knowledge
4 years ago. Dr. Burns, agree. And when it comes to data, garbage in, garbage out.
Q-Skills3D Interactive learning in Continual Improvement for all employees
4 years ago. A strange run chart for a Northeastern U.S. metropolitan area. Red: smoothed primary sewage sludge SARS-CoV-2 virus RNA concentration. Black: smoothed COVID-19 epidemiology curve (new cases). https://www.medrxiv.org/content/10.1101/2020.05.19.20105999v1.full.pdf?fbclid=IwAR1341tlA99093z6JAAiUDZsJQNVZrxHcqeb9LWyxRM7Uz3gw7Sr_yPl7tM