Brief Analysis of specific questions on the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database

Ramon De L’Hotellerie - Reproducible Research

July 20, 2020

Synopsis

Storms and other severe weather events can cause both public health and economic problems for communities and municipalities. Many severe events can result in fatalities, injuries, and property damage, and preventing such outcomes to the extent possible is a key concern.

This project involves exploring the U.S. National Oceanic and Atmospheric Administration’s (NOAA) storm database. This database tracks characteristics of major storms and weather events in the United States, including when and where they occur, as well as estimates of any fatalities, injuries, and property damage.

Data Processing

The data for this assignment comes in the form of a comma-separated-value file compressed via the bzip2 algorithm to reduce its size. You can download the file from the course web site:

There is also some documentation of the database available. Here you will find how some of the variables are constructed/defined.

The events in the database start in the year 1950 and end in November 2011. In the earlier years of the database, there are generally fewer events recorded, most likely due to a lack of good records. More recent years should be considered more complete.

I start by loading the necessary libraries, downloading the data to be analyzed, and checking the first few rows.
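A minimal Python sketch of the loading step, using only the standard library. It is an illustration, not necessarily the code used for this analysis. The function name `load_storm_rows` and the `latin-1` encoding are assumptions, and the file is expected to have been downloaded already.

```python
import bz2
import csv


def load_storm_rows(path):
    """Read a bzip2-compressed CSV file into a list of dicts, one per row."""
    # bz2.open in text mode decompresses on the fly. newline="" is what the
    # csv module expects, so quoted fields containing line breaks survive.
    # Assumption: latin-1 is used because it can decode any byte sequence.
    with bz2.open(path, mode="rt", encoding="latin-1", newline="") as handle:
        return list(csv.DictReader(handle))
```

Calling `load_storm_rows` with the path of the downloaded file returns one dictionary per event, keyed by column name, so `len(rows)` gives the row count and `rows[0]` shows the first record.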


A quick look at the top of the data frame shows that the data set has 902,297 rows and 37 columns.

The analysis uses the following columns, which are extracted into a new data frame:

  • EVTYPE
  • FATALITIES
  • INJURIES
  • PROPDMG
  • PROPDMGEXP
  • CROPDMG
  • CROPDMGEXP
  • STATE

Finding, across the United States, the most harmful events with respect to population health

To find the most harmful events, I calculate the total fatalities and the total injuries for each event type. First, I find the ten event types that caused the most fatalities, along with the total fatalities per event type.
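The per-event totals can be sketched in Python as follows. This is an illustration under stated assumptions: `top_events` is a hypothetical helper, rows are dictionaries keyed by column name, and the sample records are made up.

```python
from collections import defaultdict


def top_events(rows, column, n=10):
    """Sum `column` per EVTYPE and return the n largest (event, total) pairs."""
    totals = defaultdict(float)
    for row in rows:
        # Values arrive as strings when read from CSV; blank cells count as zero.
        totals[row["EVTYPE"]] += float(row[column] or 0)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


# Made-up records, purely to show the shape of the input and the output.
sample = [
    {"EVTYPE": "TORNADO", "FATALITIES": "2"},
    {"EVTYPE": "FLOOD", "FATALITIES": "1"},
    {"EVTYPE": "TORNADO", "FATALITIES": "3"},
]
print(top_events(sample, "FATALITIES", n=2))
```

The same helper works for any numeric column, so passing `"INJURIES"` instead of `"FATALITIES"` gives the corresponding ranking by injuries.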


Then, I find the ten event types that caused the most injuries, along with the total injuries per event type.


Finally, I add fatalities and injuries together for each event type, which gives the ranking of the most harmful events across the United States.
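A minimal Python sketch of the combined ranking, again as an illustration only. The helper name `casualties_by_event` is hypothetical, and the sample records are made up rather than taken from the database.

```python
from collections import Counter


def casualties_by_event(rows):
    """Rank event types by fatalities plus injuries, largest total first."""
    totals = Counter()
    for row in rows:
        # Blank cells are treated as zero before the two counts are added.
        fatalities = float(row["FATALITIES"] or 0)
        injuries = float(row["INJURIES"] or 0)
        totals[row["EVTYPE"]] += fatalities + injuries
    # most_common() with no argument returns every pair, sorted by total.
    return totals.most_common()


# Made-up records, purely to show the shape of the input.
sample_rows = [
    {"EVTYPE": "TORNADO", "FATALITIES": "1", "INJURIES": "10"},
    {"EVTYPE": "LIGHTNING", "FATALITIES": "1", "INJURIES": "0"},
    {"EVTYPE": "TORNADO", "FATALITIES": "0", "INJURIES": "4"},
]
ranking = casualties_by_event(sample_rows)
```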


Let’s now visualize how large the difference is between the most harmful events across the United States.


The conclusion is that the most harmful event type across the United States, by far, is the tornado, with 96,979 total casualties (fatalities plus injuries).

Finding, across the United States, the types of events with the greatest economic consequences

To find the types of events with the greatest economic consequences, I use the property damage and crop damage estimates.

Before any calculation, some data preparation is needed. Each damage estimate consists of two parts, a numeric amount (“-DMG”) and an alphanumeric unit code (“-DMGEXP”), so the unit codes must be converted into numeric multipliers before the amounts can be summed. Once the amounts are expressed in dollars, a new variable is created that adds crop and property damage together. Let’s start by checking the types of values.

I found different unit codes, which need to be converted into dollars:

  • K / k: thousands of dollars
  • M / m: millions of dollars
  • B / b: billions of dollars

First, I convert the property damage figures into dollars and print the 5 most current values.
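The unit conversion can be sketched in Python like this. It is an illustration, not necessarily the code used here: the names `MULTIPLIERS`, `damage_in_dollars`, and `total_damage` are hypothetical, and treating every code other than K, M, or B as a multiplier of 1 is an assumption, since the text does not say how other codes are handled.

```python
# Multipliers for the unit codes listed above.
MULTIPLIERS = {"K": 1e3, "M": 1e6, "B": 1e9}


def damage_in_dollars(amount, unit):
    """Scale `amount` by the K/M/B code in `unit` (case-insensitive)."""
    # Assumption: any other code (blank, digits, symbols) leaves the
    # amount unscaled, i.e. uses a multiplier of 1.
    multiplier = MULTIPLIERS.get(str(unit).strip().upper(), 1)
    return float(amount or 0) * multiplier


def total_damage(row):
    """Add property and crop damage for one record, both in dollars."""
    return (damage_in_dollars(row["PROPDMG"], row["PROPDMGEXP"])
            + damage_in_dollars(row["CROPDMG"], row["CROPDMGEXP"]))
```

For example, an amount of 25 with the code `K` becomes 25,000 dollars, and a record with 2.5 `M` of property damage and 500 `K` of crop damage totals 3,000,000 dollars.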


Now that we have the list of the most harmful events in terms of property and crop damage, in decreasing order, let’s present it visually.


Results

According to my analysis of the given data, the answers to the two research questions are:

  1. Across the United States, which types of events are most harmful with respect to population health? - The five most harmful events with respect to population health are Tornado, Excessive Heat, TSTM Wind, Flood and Lightning, in this order.
  2. Across the United States, which types of events have the greatest economic consequences? - The five events causing the greatest economic damage are Flood, Hurricane/Typhoon, Tornado, Storm Surge and Hail, in this order, counting property and crop damage together.

In other words, according to these figures, tornadoes and floods appear to be the most destructive events across the United States, in terms of both population health and economic damage.
