Does where we work out matter? Can we use code to live healthier? An exploration using the Foursquare API and folium
The Burning Question
Like many of you, I love food. Always have, and I’m thankful that it comes in so many wonderful flavors, shapes, and sizes! I’ve had a love-hate relationship with Fast Food over the years. I mean, who can resist a tasty burger?
Yet, while my mind enjoyed the idea of trying out a new special that was advertised, my body would go along silently for a while, until it reached a point where it couldn’t stay silent anymore.
The US is the birthplace of Fast Food. Growing up in the Eastern part of the world, I had no idea how in-your-face it was until I landed on US soil. In 2012 alone, the industry spent $4.6 B-like-boyz-n-the-hood Billion dollars on advertising their products to you. That was 10 years ago. Given the growth we’re seeing, one can assume that those numbers may be considered conservative today.
Don’t get me wrong, I still enjoy a good Fast Food meal from time to time but over the years, I’ve learned to be a bit more conscious and caring to my body.
Which got me thinking… How can anyone who lives anywhere in today’s advertisement-filled world, especially in the US, stay healthy?
Sounds impossible, unless you put yourself through ridiculous amounts of self-discipline or go live on a remote island like Tom Hanks.
You walk out the door and there are billboards and temptations everywhere, all trying to draw you in with words carefully crafted by the marketing teams of these mammoth corporations. Sure, you can resist for a while, but imagine waking up every day to this.
Even worse is when you try to give in to the health craze. You decide to get fit and head to a local gym to work out. You leave the gym after an intense session, get in your car, start to head home, and then you see this…
You decide to give in, because you deserve it after going through all that pain. Little do you realize that what you just consumed not only outweighs all the work you just put in, but has also left you unhealthier than before.
The Thought
All this got me thinking. What if I could navigate better in this temptation-filled world?
It’s not realistic to think that I can avoid Fast Food advertising completely, but what if where I choose to live, and more specifically where I choose to work out, affects whether I give in to cravings afterwards?
Enter Foursquare API & Folium.
Foursquare is an independent, global, location-based platform that collects user-generated information about public points of interest, via its app and other sources, into a streamlined database. Developers can access this data through its API by creating an account on the platform.
I used the API to return public venue information based on geospatial data (latitudes, longitudes).
folium builds on the data wrangling strengths of the Python ecosystem and the mapping strengths of the leaflet.js library. Manipulate your data in Python, then visualize it in a Leaflet map via folium. [1]
Case Study
Anna is a 24 year-old from Raleigh, NC who works in healthcare. She enjoys the East Coast life but is considering a shift to the West Coast for the next step in her career and a curiosity to experience life elsewhere. Anna, like many of us, has struggled with the temptations of advertising over the years and wants to live a healthier lifestyle.
She doesn’t know which Neighborhoods in L.A. would be good to move to, so she comes to us for help.
Our goal is to find the ideal neighborhood areas that would foster healthy behaviors.
Data Sources:
In addition to the Public Venue data that we would obtain from the Foursquare API, we need more quantifiable information to build our solutions visually.
That was all a mouthful. Ok, let’s load it all in.
Data Imports
To import the Neighborhood List from L.A. Times, we will be using the BeautifulSoup package, which can pull data out of HTML and XML files.
Much of the time spent working with Data revolves around cleaning messy data and ensuring that it’s ready for manipulation. Skipping this step would only lead to inaccuracies during our analysis later on.
The Public Health Data can be downloaded as .xlsx files, which we can then import using pandas.
As previously mentioned, the two key indicators we want to focus on are Obesity Rate and Healthy Adult Percentage (the Percentage of Adults who meet Recommended Guidelines for Physical Activity). Luckily, the Department of Public Health has this data split by indicator. All we have to do is bring them in and combine them.
Next up, we need to get the boundaries.
GeoJSON is a JSON-based format that, once loaded in Python, is a dictionary holding the co-ordinate data for the polygons that form the boundaries of a geographical region: essentially a mixture of spatial and non-spatial attributes for any location. In our case, this would be for LA County.
The data that came in from these sources was a bit messy. After ages of tinkering (which I won’t go into in detail here) and staring at the screen, we were able to clean it up into a usable form.
Getting Geographical Coordinates
To visualize Los Angeles on a Folium map, we need to get its co-ordinates first. This is where geopy and geocoder come in handy. These packages can convert geographical location names into their respective lat-long values. Pretty handy.
Our DataFrames contain the names of Neighborhoods as well. Let’s run them through a loop with these packages to get the necessary spatial values.
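A minimal sketch of such a loop is below. It assumes a geocoding callable that takes a place name and returns an object with latitude and longitude attributes, or None when nothing is found, which is the shape geopy's Nominatim.geocode returns. The function name geocode_neighborhoods, the suffix and pause parameters, and the offline stand-in geocoder are illustrative and not taken from the original project.

```python
import time
from types import SimpleNamespace


def geocode_neighborhoods(names, geocode, suffix=", Los Angeles, CA", pause=0.0):
    """Return {name: (lat, lon)} for every name the geocoder can resolve.

    `geocode` is any callable that maps a query string to an object with
    `.latitude` and `.longitude`, or to None when there is no match.
    """
    coords = {}
    for name in names:
        location = geocode(name + suffix)
        if location is not None:
            coords[name] = (location.latitude, location.longitude)
        # Public geocoders throttle clients, so wait between requests.
        time.sleep(pause)
    return coords


# Stand-in geocoder so the sketch runs offline (coordinates are approximate).
_FAKE_RESULTS = {
    "Bel-Air, Los Angeles, CA": SimpleNamespace(latitude=34.10, longitude=-118.46),
    "Manhattan Beach, Los Angeles, CA": SimpleNamespace(latitude=33.88, longitude=-118.41),
}


def fake_geocode(query):
    return _FAKE_RESULTS.get(query)


coords = geocode_neighborhoods(["Bel-Air", "Manhattan Beach", "Nowhere"], fake_geocode)
print(coords)
```

With geopy installed, Nominatim(user_agent="your-app-name").geocode could be passed in place of fake_geocode, with pause set to a second or more to stay within the service's request limits. Names that cannot be resolved are simply left out of the result, so it is worth checking which ones are missing before mapping.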
The Magic of Foursquare
The Foursquare API gives access to a giant database of public points of interest, especially for the U.S. This information is valuable when it comes to generating insights for any industry.
To get going with the API, you need to create a Dev account with the platform first. Once logged in, the platform will allow you to create a new app. This new app will give you two key pieces of information, CLIENT_ID and CLIENT_SECRET, which you later need to copy into your code.
CLIENT_ID = This will be your Foursquare ID
CLIENT_SECRET = This will be your Foursquare Secret
More info on setup can be found here.
Now, onto the API Calls.
Once we have the required information declared, we need to make the API calls to get the venue information from the server.
You can do a simple call to get all the venues for a specific set of co-ordinates using the following url:
url = 'https://api.foursquare.com/v2/venues/explore?client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'
If you want to filter the data and obtain venues for only specific types, such as in our case, Fast Food Restaurants and Fitness Centers, you can refer to their Venue Category documentation, which lists the IDs of every category.
You can then change the url to include each category type:
url = 'https://api.foursquare.com/v2/venues/explore?categoryId={}&intent=browse&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'
where categoryId={} lets you specify which IDs you want to request.
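Filling the {} placeholders by position is easy to get wrong, so here is a sketch that builds both url variants from named parameters using only the standard library. The helper name build_explore_url, the credentials, the version date, and the category IDs are all placeholders, and sending several category IDs as one comma-separated value is an assumption to check against the Foursquare documentation.

```python
from urllib.parse import urlencode

EXPLORE_ENDPOINT = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(client_id, client_secret, version, lat, lon,
                      radius, limit, category_ids=None):
    """Build a venues/explore URL, optionally restricted to some categories."""
    params = {}
    if category_ids:
        # Assumption: several IDs can be sent as one comma-separated value.
        params["categoryId"] = ",".join(category_ids)
        params["intent"] = "browse"
    params.update({
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,
        "ll": f"{lat},{lon}",
        "radius": radius,
        "limit": limit,
    })
    # safe="," keeps the commas in `ll` and `categoryId` readable.
    return EXPLORE_ENDPOINT + "?" + urlencode(params, safe=",")


# Placeholder values: substitute your own credentials and real category IDs.
plain_url = build_explore_url("YOUR_ID", "YOUR_SECRET", "20200101",
                              34.0522, -118.2437, radius=1000, limit=50)
filtered_url = build_explore_url("YOUR_ID", "YOUR_SECRET", "20200101",
                                 34.0522, -118.2437, radius=1000, limit=50,
                                 category_ids=["FAST_FOOD_ID", "GYM_ID"])
print(plain_url)
print(filtered_url)
```

Either string can then be handed to whichever HTTP client you use to fetch the JSON response.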
Oh, and one more thing to keep in mind: the API has a limit on the number of calls you can make per day on the Sandbox Tier Account. Bear that in mind for large datasets.
Let’s take a look at the API Data we received.
Great job! You’ve successfully got all the heavy-lifting out of the way. Now onto the fun part.
Plotting the Maps using Folium
folium is a great visualization tool when it comes to geospatial data. What I love about it is how easily you can visualize your information in multiple ways.
Let’s look at a simple map generated with folium. There are several map types as well, which folium calls tiles. They each serve a different purpose, aesthetically or functionally.
For example, Stamen Terrain helps visualize the vegetation and terrain of each location, while Stamen Toner introduces sharp contrasts between land and water.
To start with our data, let’s plot a simple map with Folium using just the lat-long values of all the Neighborhoods in our DataFrame.
We can see how the Neighborhoods are spread out on the map. Even niftier with Folium is that we can create a custom label for each point, as we see with the Neighborhood of Bel-Air here in the Westside Region.
Now, let’s take a deeper look into our Case Study.
Anna wants to maintain a healthy lifestyle when moving to L.A. However, the location of Fast Food restaurants and Fitness Centers is probably not the first thing on her mind when it comes to considering Neighborhoods.
But, given that we’ve already seen how much advertising prevails in the United States, with Billboards everywhere and many in front of Fast Food Restaurants, we want to avoid areas with a high density of Fast Food Restaurants. We also aren’t necessarily looking for an area with a high density of Fitness Centers, because a Neighborhood with a high density of Fitness Centers may contain an equally high density of Fast Food restaurants. That would be counter-productive to our approach.
We want to isolate areas that would have a low density of Fast Food restaurants and the presence of at least a few Fitness Centers.
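As a rough sketch of that filter, the snippet below counts venues per Neighborhood and keeps the ones with few Fast Food restaurants and at least one Fitness Center. Raw counts stand in for density here, the thresholds are arbitrary, and the Neighborhood names "A", "B", and "C" and the function name shortlist are made up for illustration; none of this comes from the original analysis.

```python
from collections import Counter


def shortlist(venues, max_fast_food=2, min_fitness=1):
    """Return Neighborhoods with few Fast Food venues and some Fitness venues.

    `venues` is an iterable of (neighborhood, kind) pairs, where kind is
    either "fast_food" or "fitness".
    """
    venues = list(venues)
    fast_food = Counter(hood for hood, kind in venues if kind == "fast_food")
    fitness = Counter(hood for hood, kind in venues if kind == "fitness")
    hoods = {hood for hood, _ in venues}
    return sorted(
        hood for hood in hoods
        if fast_food[hood] <= max_fast_food and fitness[hood] >= min_fitness
    )


# Toy data: A has 1 fast food / 2 fitness, B has 4 / 3, C has 1 / 0.
toy_venues = (
    [("A", "fast_food")] * 1 + [("A", "fitness")] * 2
    + [("B", "fast_food")] * 4 + [("B", "fitness")] * 3
    + [("C", "fast_food")] * 1
)
candidates = shortlist(toy_venues)
print(candidates)
```

Dividing the counts by each Neighborhood's area or population would bring this closer to a true density, if that data is available.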
There are two assumptions that we base our approach on:
Let’s see what the data shows.
Now, to map every little fast food joint and gym on the map would probably blow up my computer. Hence, we’ve acquired a smaller sample size from the API.
The blue points represent Fitness Centers and the red points, Fast Food Restaurants.
From this initial observation alone, we can see some clear differences in the way the establishments are spread out.
We see five possible Neighborhoods that have a lower density of Fast Food options compared to the others.
Santa Monica | Manhattan Beach | Rancho Palos Verdes | Downtown L.A. | South Pasadena
Now, it’s easy to display a map showing the Obesity Rates or the Healthy Adult Percentages we’ve collected. We can even tell folium to control the size of each point that appears on the map, so that higher rates show up as larger points, as in the map below.
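One way to size the points is to rescale each rate linearly into a range of marker radii. The sketch below shows only that rescaling step; the function name scale_radius, the radius range, and the toy rates are assumptions for illustration, not the values used for the map.

```python
def scale_radius(value, lo, hi, min_radius=4.0, max_radius=20.0):
    """Map a value in [lo, hi] linearly onto [min_radius, max_radius]."""
    if hi == lo:
        # All values are equal, so use a single mid-sized radius.
        return (min_radius + max_radius) / 2
    fraction = (value - lo) / (hi - lo)
    return min_radius + fraction * (max_radius - min_radius)


# Toy rates in percent (made up for illustration).
rates = {"A": 10.0, "B": 25.0, "C": 40.0}
lo, hi = min(rates.values()), max(rates.values())
radii = {name: scale_radius(rate, lo, hi) for name, rate in rates.items()}
print(radii)
```

Each radius could then be passed as the radius argument of a folium.CircleMarker placed at that Neighborhood's co-ordinates.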
This is neat.
But where folium really shines is when it comes to Choropleth maps.
What’s a Choropleth map?
A choropleth map (from Greek χῶρος choros ‘area/region’ and πλῆθος plethos ‘multitude’) is a type of thematic map in which a set of pre-defined areas is colored or patterned in proportion to a statistical variable that represents an aggregate summary of a geographic characteristic within each area, such as population density or per-capita income. [2]
It’s a great way to visually see how the data is spread across the map.
But, to create a Choropleth map, we need the boundary data for Los Angeles. Luckily, we took care of that earlier by importing the GeoJSON file.
folium seamlessly works with both the geospatial data and our DataFrame so that we can get a visually functional map.
One key element to fill in is the key_on parameter. It is the path within each GeoJSON feature to the value that gets matched against the key column of our DataFrame. Usually, that value is the name of the boundary, found at a path similar to feature.properties.name
Explore your GeoJSON to get an idea where this may be.
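A quick way to explore is to load the file with the json module and look at the keys of the first feature. The sketch below uses a tiny stand-in GeoJSON with made-up area names and a "name" property; a real file may store the boundary name under a different key, which is exactly what this kind of inspection reveals.

```python
import json

# Tiny stand-in for a real boundary file (names and shapes are made up).
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"name": "Area A"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[0, 0], [0, 1], [1, 1], [0, 0]]]}},
    {"type": "Feature",
     "properties": {"name": "Area B"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[1, 0], [1, 1], [2, 1], [1, 0]]]}}
  ]
}
"""
geo = json.loads(geojson_text)

# Inspect one feature to see which path holds the boundary name.
first = geo["features"][0]
feature_keys = sorted(first.keys())
property_keys = sorted(first["properties"].keys())
print(feature_keys, property_keys)

# Check that every name in our table has a matching boundary.
boundary_names = {f["properties"]["name"] for f in geo["features"]}
table_names = {"Area A", "Area B", "Area C"}
unmatched = sorted(table_names - boundary_names)
print(unmatched)
```

Here the name sits under properties, so key_on would be "feature.properties.name". Any name that lands in unmatched has no polygon to bind to on the Choropleth map, so it is worth fixing spelling differences between the two sources first.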
Let’s take a look at how Obesity Rates vary across Los Angeles. You can see that folium automatically creates a legend for us as well.
From the Obesity Rate data, we can see that:
The highest Obesity Rates were found in these Neighborhoods
The lowest Obesity Rates were found in these Neighborhoods
Interesting… San Gabriel Valley seems to show up at both ends of the spectrum. Additionally, Southern L.A. seems to be an area that we might want to avoid.
On the other end, Manhattan Beach and the Westside areas, which at first glance looked like possible ideal locations, are also backed up by our Obesity data.
Now, let’s take a look at how Healthy Adult Percentage Rates vary across Los Angeles.
From the Healthy Adult data, we can see that:
The highest Healthy Adult Percentage Rates were found in these Neighborhoods
The lowest Healthy Adult Percentage Rates were found in these Neighborhoods
That’s a third hit for Manhattan Beach! Could this be an ideal location for Anna? The other Westside areas seem promising too.
Finally, let’s take a look at one more cool feature of folium.
Heat Maps.
Yes, you heard that right. I don’t know why, but I’ve always had a fascination with heat maps. Maybe it’s their glorified use in the many movies we watched growing up. They’re also simply easy to understand, and that’s important when it comes to helping our audience connect with the data.
To get a more accurate visual using a Heat Map, we’ve run another API call to get a larger sample size of just the Venue Categories involving Gym/Fitness Center and Fast Food Restaurants. This uses the category ID filtering method described earlier. Bear in mind the longer loading times and save your work accordingly.
Some of the Neighborhoods that stand out in the Heat Map for Fast Food chains are:
Note: We are looking for green spots as our ideal since we want a lower density of Fast Food chains for Anna.
Some of the Neighborhoods that stand out in the Heat Map for Fitness spots are:
Recommendations & Conclusion
From our observations, it’s clear that there are a few Neighborhoods that seem to stand out as ideal for Anna.
Some ideal choices would be Neighborhoods on the West Side of L.A. such as the Santa Monica Area, the Hollywood Hills, Rancho Palos Verdes, Manhattan Beach, or further down in the South such as the Long Beach area, or in the North such as the South Pasadena area.
These communities have a low Obesity Rate and a high Physical Activity rate among their inhabitants, which may prompt Anna to engage more in a healthier lifestyle. They also tend to have a lower number of Fast Food options while boasting a decent number of Fitness options. This would probably make her drives to and from a workout less of a distraction, with fewer establishments that could hinder her health goals.
If Anna can afford a more expensive lifestyle, she can choose to move to areas such as Santa Monica or Hollywood Hills. If she wants to reduce costs, she can consider areas such as Manhattan Beach, South Pasadena, or Rancho Palos Verdes.
Overall, Manhattan Beach seems to be the best location for her first move to L.A.
The approach we took can definitely be taken further or reconsidered. Some additional things to consider:
The Foursquare API allows for great analysis on many geospatial problems. Other questions it could help answer include identifying the ideal location for a business such as a restaurant, identifying the safest areas to reside in, and so on.
Thank you
If you read this far, thank you. It truly means a lot that you took the time to read through this project of mine.
I hope you can agree with me how cool it is that we can use code to find interesting solutions to problems we may not even have thought of.
If you'd like to have a discussion on these, feel free to leave a comment or reach out to me!
Oh, and the source code can be found on my GitHub.
Cheers!