Temperature a Statistically Significant Factor in Coronavirus Infection Rate
Coronavirus Dashboard - Photo by Markus Spiske on Unsplash

Summary

I analyzed 140 data points from 14 cities in mainland China to answer the question of the century: does warmer weather slow the spread of coronavirus?

The data showed that only 3 factors were statistically significant in explaining rates of infection.

Together, these 3 factors explain 90% of the variability in infection growth rates.

3 Significant Factors

  1. Time Elapsed Since First Case: a proxy for China's nationwide response measures
  2. Physical Distance to Wuhan: a proxy for non-community spread infections
  3. Recent Daily-Low Temperatures: the average daily-low temperature leading up to the period in question

Key Findings

Warmer daily-low temperatures had a statistically significant association with lower expected infection growth rates (p-value ~.009). Interestingly, recent daily-low temperatures were slightly more significant than the alternative of recent daily-high temperatures (in an otherwise equivalent model). This may imply that sustained warmer temperatures have a more beneficial effect than shorter-lived spikes in temperature, but it is not conclusive.

The findings suggest that the total reduction in infection growth rates from winter to summer may be only ~20%.

By far the largest impact on reducing infection rates came from time elapsed since the outbreak, underscoring the effects of China's strong response, including extreme measures for social distancing and quarantine.

Four other factors I examined (population density, recent humidity, recent UV index, and recent chance of rainfall) had no statistically significant impact on infection rates.


Analysis: Detailed

My study looked at infection rates across 14 cities impacted by coronavirus, intentionally excluding Wuhan due to its unique circumstances as the origin of the outbreak.

To smooth day-to-day fluctuations, I segmented data into periods of 3-day averages of reported cases, and then analyzed period-to-period logarithmic growth rates.
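As a rough illustration of this smoothing step, the Python sketch below averages reported cases over consecutive 3-day periods and then takes the logarithmic growth rate between neighboring periods. It assumes the input is a list of cumulative reported cases per day and that the periods do not overlap; the function names and the case counts are hypothetical and do not come from the study's data.

```python
import math

def three_day_averages(daily_cases):
    """Average reported cases over consecutive, non-overlapping 3-day periods.

    Assumption: daily_cases holds cumulative reported cases, one value per day.
    Any trailing days that do not fill a whole period are dropped.
    """
    usable = len(daily_cases) - len(daily_cases) % 3
    return [sum(daily_cases[i:i + 3]) / 3 for i in range(0, usable, 3)]

def log_growth_rates(period_averages):
    """Period-to-period logarithmic growth rate: ln(current / previous)."""
    return [
        math.log(curr / prev)
        for prev, curr in zip(period_averages, period_averages[1:])
    ]

# Hypothetical cumulative case counts for one city (illustration only).
example_cases = [10, 14, 18, 25, 31, 36, 40, 43, 45]
period_averages = three_day_averages(example_cases)
growth_rates = log_growth_rates(period_averages)
```

With these made-up counts, the second growth rate is smaller than the first, which is the kind of period-over-period slowdown the analysis tries to explain.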

The total time frame ran from late January (when cases outside Wuhan were first published) to late February, when China achieved effective containment of the virus (as evidenced by case growth rates across China at or near zero).

[Chart: China coronavirus growth rates]

This yielded 10 unique 3-day periods for each of the 14 cities analyzed, or 140 total data points, each with a different realized period growth rate.

I then augmented each period with data for 3 types of factors.

Demographic Factors

For each city, I looked at:

  • distance from the city to Wuhan in natural log of kilometers, as a proxy for an "external" source of infection growth not explained by community spread
  • population density, in millions of people per square kilometer

Social Response Factors

As seen in the chart above, growth rates have fallen consistently (in a non-linear fashion) since outbreak data were first reported for cities other than Wuhan. This is almost certainly explained by China's swift and drastic response, with measures such as social distancing and quarantines.

  • time elapsed since the start of the study, as the natural log of days elapsed, as a proxy for the impact of China's measures to treat and contain the virus
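The two log-scaled factors (distance to Wuhan and time elapsed) could be computed along the lines of the sketch below. The function name is hypothetical, and counting days from 1 is an assumption made here because ln(0) is undefined; the article does not say how the first day was handled.

```python
import math

def log_features(distance_km, days_elapsed):
    """Return (ln of distance to Wuhan in km, ln of days elapsed).

    Assumption: days_elapsed counts from 1, because ln(0) is undefined.
    """
    if distance_km <= 0 or days_elapsed <= 0:
        raise ValueError("both inputs must be positive to take a natural log")
    return math.log(distance_km), math.log(days_elapsed)

# Hypothetical city 500 km from Wuhan, observed on day 3 and on day 30.
early = log_features(500, 3)
late = log_features(500, 30)
```

On a log scale, moving from day 3 to day 30 changes the time factor by ln(10), the same amount as moving from day 30 to day 300, so the factor can capture a decline in growth rates that is steep early on and flatter later.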

Weather Factors

Each date and each city obviously had unique local weather patterns leading up to the date in question. To examine the impact of recent weather on each period's growth rate, I looked at:

  • daily-low temperature (3-day average prior to the day in question)
  • daily-high temperature (3-day average prior to the day in question)
  • daily humidity (3-day average prior to the day in question)
  • daily UV index (3-day average prior to the day in question)
  • daily precipitation (3-day average of chance of rainfall prior to the day in question)
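Each of the weather factors above is a trailing average, which might be computed as in the following sketch. It assumes the window covers the 3 days strictly before the day in question (excluding that day itself); the function name and the temperature readings are hypothetical.

```python
def trailing_average(values, day_index, window=3):
    """Mean of the `window` values strictly before `day_index`.

    Assumption: the day in question is excluded from its own average.
    """
    if day_index < window:
        raise ValueError("not enough prior days for the requested window")
    prior = values[day_index - window:day_index]
    return sum(prior) / window

# Hypothetical daily-low temperatures in degrees Celsius for one city.
daily_lows_c = [2.0, 4.0, 3.0, 5.0, 7.0]
recent_low = trailing_average(daily_lows_c, day_index=3)  # averages 2.0, 4.0, 3.0
```

The same helper would apply unchanged to daily highs, humidity, UV index, and chance of rainfall.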


Findings: Detailed

Given the amount of data involved and the desire for high model explicability, more complicated machine learning techniques were overkill for this analysis.

A basic regression analysis of the above factors found that only 3 factors were statistically significant: distance to Wuhan, time elapsed, and temperature.

Due to the collinearity of daily-low temperatures and daily-high temperatures, I created two otherwise equivalent models, one using daily-lows as the measurement of temperature, and an alternative using daily-highs.

[Chart: Coronavirus growth rate regression]

Both measures of temperature were found to be significant in their respective models, with p-values of .009 and .029 for daily-lows and daily-highs respectively.
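To make the two-model setup concrete, here is a minimal ordinary least squares sketch in plain Python that fits one model with daily-lows and one with daily-highs and compares their R². Everything in it is an assumption for illustration: the data are synthetic, the "true" coefficients are arbitrary rather than the study's estimates, and the helper names are hypothetical. It does not compute p-values, which would normally come from a statistics package.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_ols(rows, y):
    """Fit y = b0 + b1*x1 + ... by least squares; return (coefficients, r_squared)."""
    x = [[1.0] + list(row) for row in rows]  # prepend the intercept column
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)  # normal equations
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in x]
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return beta, 1.0 - ss_res / ss_tot

# Synthetic city-periods: (ln days elapsed, ln distance to Wuhan in km).
base = [(0.0, 5.0), (1.1, 5.0), (1.9, 6.2), (2.3, 6.2), (2.7, 7.0), (3.0, 7.0)]
lows = [2.0, 4.0, 1.0, 8.0, 3.0, 10.0]     # synthetic daily-lows
highs = [9.0, 12.0, 6.0, 15.0, 13.0, 16.0]  # synthetic daily-highs

# Arbitrary "true" coefficients used only to generate the synthetic growth rates.
true_beta = [1.5, -0.3, -0.05, -0.01]
growth = [
    true_beta[0] + true_beta[1] * d + true_beta[2] * w + true_beta[3] * t
    for (d, w), t in zip(base, lows)
]

# Two otherwise equivalent models that differ only in the temperature measure.
beta_low, r2_low = fit_ols([(d, w, t) for (d, w), t in zip(base, lows)], growth)
beta_high, r2_high = fit_ols([(d, w, t) for (d, w), t in zip(base, highs)], growth)
```

Because the synthetic growth rates were generated from the daily-lows, the daily-low model recovers the generating coefficients and fits better than the daily-high model. That outcome is built into the example and says nothing about the real data; it only shows the mechanics of comparing two models that swap one collinear predictor for another.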

This may indicate that sustained warmer temperatures have a more significant effect than temporary spikes in temperature, but it is not conclusive.

No other weather factors were statistically significant.

More curiously, population density was also not statistically significant, which may reflect the impact of China's extreme social distancing measures, limitations in the demographic data, or something still not understood about the virus or how infections spread.


Data Sources & Acknowledgements

https://github.com/imantsm/COVID-19
