Using Open Source Data and Python Data Analysis to Measure Urban Nightlife Intensity in Tel-Aviv
Omri Shaffer
Transportation Planner | Strategic, Master Plans, Networks, Micromobility, Public Transit, Street Plans, Urban Planning, GIS, Data, Dashboards, Models, Forecasts, Python, TransCAD, Emme, GISDK
I am presenting this article as part of a capstone project submission, as required by the Coursera IBM Data Science Specialization. In it I will briefly show how I used Python and open data sources to measure the intensity of nightlife activity in Tel-Aviv. I admit that at first I wasn't sure whether this could ever be useful, but after seeing the results I was inspired professionally to follow this lead, as I will explain later.
Possible stakeholders for such information include, for example:
- Families looking for a quiet neighborhood to live in
- Young adults looking for a neighborhood with rich nightlife
- An entrepreneur looking to open a late-night venue in an area that already has a lot of such activity
Based on the goals of this project we will require an open data source with information about venues, including their type, location, and closing hours.
For this we will use the following sources:
1. Google Geocoding API to retrieve locations based on names
2. Foursquare API to retrieve all the venues and related information: category, location, and closing hours
3. Neighborhood polygons from the Tel-Aviv Municipality GIS website
4. OpenStreetMap for displaying the data in geographic context
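As an illustration of the second source, here is a minimal sketch of how a venue request could be assembled. It assumes the legacy Foursquare v2 `venues/explore` endpoint; the article does not name the endpoint that was actually used. The function name `build_explore_url`, the coordinates, the radius, and the credentials are placeholders for illustration only.

```python
from urllib.parse import urlencode

# Assumption: the legacy Foursquare v2 "explore" endpoint.
EXPLORE_URL = "https://api.foursquare.com/v2/venues/explore"


def build_explore_url(lat, lng, client_id, client_secret,
                      radius_m=500, limit=50, version="20200101"):
    """Return a venues/explore request URL for one point and search radius."""
    params = {
        "client_id": client_id,          # placeholder credentials
        "client_secret": client_secret,
        "v": version,                    # API version date, YYYYMMDD
        "ll": f"{lat},{lng}",            # latitude,longitude of the search center
        "radius": radius_m,              # search radius in meters
        "limit": limit,                  # maximum number of results requested
    }
    return f"{EXPLORE_URL}?{urlencode(params)}"


# Hypothetical point in central Tel-Aviv; no request is sent here.
url = build_explore_url(32.0853, 34.7818, "YOUR_ID", "YOUR_SECRET")
print(url)
```

Covering a whole city with such a request would mean repeating it over many points, for example one per neighborhood, and removing duplicate venues afterwards.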
Here are all 509 venues which were extracted from Foursquare:
Here is a sample of the data which was displayed on the map:
After querying Foursquare again for each venue in the list, many venues turned out to have no closing-hours data. These had to be excluded, along with venues whose calculated average closing hour was earlier than 18:00. In the end only 193/507 (~38%) remain:
Sample data: (Please ignore the date in the timestamp. Only the clock time is relevant.)
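The filtering step described above can be sketched roughly as follows. This is an illustration under assumptions, not the code used in the project: the venue names and times are made up, and the rule that closings before 06:00 belong to the previous night (and so count as 24 or more hours) is my own assumption for handling venues that close after midnight.

```python
from datetime import time

CUTOFF_HOUR = 18.0       # venues closing earlier than this on average are dropped
DAY_BOUNDARY_HOUR = 6    # assumption: closings before 06:00 belong to the previous night


def closing_to_hours(t):
    """Convert a clock time to hours, pushing after-midnight closings past 24."""
    hours = t.hour + t.minute / 60
    return hours + 24 if t.hour < DAY_BOUNDARY_HOUR else hours


def average_closing_hour(closing_times):
    """Mean closing hour of one venue, or None when no closing hours are known."""
    if not closing_times:
        return None
    return sum(closing_to_hours(t) for t in closing_times) / len(closing_times)


# Hypothetical venues: only the clock time matters, not the date.
venues = {
    "Cafe A": [time(17, 0), time(17, 30)],
    "Bar B": [time(23, 0), time(2, 0)],
    "Shop C": [],  # no closing-hours data
}

averages = {name: average_closing_hour(ts) for name, ts in venues.items()}
late_venues = {name: avg for name, avg in averages.items()
               if avg is not None and avg >= CUTOFF_HOUR}
print(late_venues)
```

Without the after-midnight adjustment, a bar closing at 02:00 would pull its own average down and could be filtered out as an early-closing venue.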
Methodology
In this project we seek to analyze the closing hours of venues in the various neighborhoods of Tel-Aviv. We are interested in venues of the following categories: Arts & Entertainment, Food, Nightlife Spot, Outdoors & Recreation, Professional & Other Places, and Shop & Service.
The venues will be associated with their corresponding neighborhood, based on the venue's location in relation to the neighborhood polygons. The average closing hour will be calculated for each venue, and that single value will represent the venue in the analysis.
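To make the venue-to-neighborhood association concrete, here is a small pure-Python sketch using the ray casting point-in-polygon test. This is a stand-in I chose to keep the example self-contained; a GIS library with a spatial join would be the more usual tool. The polygons and names below are hypothetical unit squares, not real neighborhood boundaries.

```python
def point_in_polygon(x, y, polygon):
    """Ray casting: toggle 'inside' each time a ray going right from (x, y) crosses an edge."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_neighborhood(lng, lat, neighborhoods):
    """Return the name of the first polygon containing the point, or None."""
    for name, polygon in neighborhoods.items():
        if point_in_polygon(lng, lat, polygon):
            return name
    return None


# Hypothetical neighborhoods as lists of (x, y) vertices.
neighborhoods = {
    "West": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "East": [(1, 0), (2, 0), (2, 1), (1, 1)],
}

print(assign_neighborhood(0.5, 0.5, neighborhoods))
print(assign_neighborhood(1.5, 0.5, neighborhoods))
print(assign_neighborhood(3.0, 0.5, neighborhoods))
```

Venues that fall outside every polygon get `None` and would need to be dropped or inspected by hand.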
We will analyze the data using choropleth maps to see the average closing hour in each neighborhood. More interestingly, by summing the closing hours of all the venues in a neighborhood, we get a measure not only of how late the neighborhood closes but also of how intense its late-hours activity is: the more venues there are, and the later they close, the higher the score. This sum is the same as the neighborhood's average closing hour multiplied by its number of venues.
Finally, we will cluster the neighborhoods based on the variation in venues' closing hours. The number of venues closing in each whole hour will serve as the input to a k-means cluster analysis.
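The scoring idea can be illustrated with a few lines of Python. The records below are invented for the example, and closing hours after midnight are written as 24 or more (an assumption of this sketch). The example also checks that the summed score equals the mean closing hour times the number of venues.

```python
from collections import defaultdict

# Hypothetical (neighborhood, average closing hour) records, one per venue.
records = [
    ("Center", 26.0), ("Center", 23.5), ("Center", 22.0),
    ("North", 23.0),
    ("South", 19.0), ("South", 20.0),
]


def neighborhood_stats(records):
    """Return count, mean closing hour, and summed score per neighborhood."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for name, hour in records:
        totals[name] += hour
        counts[name] += 1
    return {
        name: {
            "count": counts[name],
            "mean": totals[name] / counts[name],
            "score": totals[name],  # sum of closing hours = intensity score
        }
        for name in totals
    }


stats = neighborhood_stats(records)
ranking = sorted(stats, key=lambda name: stats[name]["score"], reverse=True)
print(ranking)
```

In this toy data "South" outranks "North" even though its venues close earlier, because it has two venues rather than one. That is exactly the effect of scoring by the sum rather than by the average alone.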
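As a sketch of the clustering input, each neighborhood can be turned into a vector of venue counts per closing hour. The bin range (18:00 through 05:00, with after-midnight hours written as 24 to 29) and the rounding rule are assumptions of this example; the article does not specify them.

```python
from collections import Counter

# Assumed hour bins: 18, 19, ..., 29 (29 stands for 05:00 the next morning).
HOURS = list(range(18, 30))


def hourly_profile(closing_hours):
    """Count venues per whole closing hour, in the fixed order given by HOURS."""
    # int(h + 0.5) rounds half up, avoiding Python's round-half-to-even behavior.
    counts = Counter(int(h + 0.5) for h in closing_hours)
    return [counts.get(hour, 0) for hour in HOURS]


# Hypothetical average closing hours of the venues in one neighborhood.
profile = hourly_profile([22.0, 22.4, 23.5, 25.0])
print(profile)
```

Using the same fixed bins for every neighborhood keeps the vectors comparable, which k-means requires.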
Analysis
(First Ever?) Tel-Aviv nightlife intensity map:
Note that neighborhoods shown in black had no data (meaning no venues known to be open after 18:00).
Here are the 15 most intense neighborhoods, with their scores, as shown in dark green in the map:
I am very curious to know what you think! Does it match your experience with Tel-Aviv?
Personally, I was surprised how accurately the findings match my own experience.
Next, I performed k-means cluster analysis, a machine learning technique from the unsupervised learning family. I chose 5 clusters (k=5). The analysis was performed on the hourly distribution of closing hours within each neighborhood, to see whether the hourly distribution of the intensity offers insights beyond the overall intensity.
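For readers unfamiliar with k-means, here is a minimal pure-Python sketch of the standard iteration (assign each point to its nearest centroid, then move each centroid to the mean of its points). It is not the code used for this analysis; the article does not say which implementation was used, and a library implementation would be the practical choice. The four short vectors and k=2 are hypothetical, chosen only to keep the example readable.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Plain Lloyd iteration with a simple deterministic initialization."""
    n = len(points)
    # Assumption: start from k points spread evenly through the input order.
    centroids = [list(points[i * n // k]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical profiles: venue counts closing in three consecutive hours.
profiles = [
    [5, 1, 0],  # mostly early closers
    [4, 2, 0],
    [0, 1, 5],  # mostly late closers
    [0, 2, 4],
]

labels, centroids = kmeans(profiles, k=2)
print(labels)
print(centroids)
```

Because the vectors hold raw counts, neighborhoods with many venues sit far from those with few, so the clusters reflect both the shape of the hourly distribution and its size.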
Here are the results:
An important note before we continue:
It is important to understand the limits of this analysis. Mainly, we don't know how reliable and statistically credible the Foursquare data is. In addition, the data we have collected only tells us how many venues are still open at late hours, not how intense the actual activity in them is. As a result, a large venue that hosts hundreds of people counts the same as a small bar that hosts dozens. Still, the results match my own experience with Tel-Aviv, which is an encouraging sign. Further research and comparison with other data sources should be pursued.
Conclusion
We set out to discover and analyze the nightlife intensity in Tel-Aviv, by neighborhood. In conclusion, we found a great variety in the intensity of nightlife throughout the city, with most of it concentrated around the center. Yet nightlife was also found in some peripheral areas, which reminds us to always study the data and not rely on our hunches.
From the initial data exploration, with help from the choropleth map, it seems that the strongest nightlife intensity in Tel-Aviv is in the neighborhood '?? ????' (in the yellow category, in the center). Later, the k-means cluster analysis singled out a different neighborhood, '????? ????-???? ??????' (in the blue category).
From inspecting the most intense clusters (above), we can see that labels 1 and 3 were applied to the most intense neighborhoods in the city. It also appears that the one neighborhood given label 3 was singled out for having a very intense nightlife scene (like some others in the yellow category, label 1) while also closing relatively early, mostly by around 22:00. This may be a very interesting conclusion for our stakeholders, for example people looking for a vibrant neighborhood to live in that will not keep them up too late, such as university students. I would probably not have noticed this myself, or at least not quickly, and it shows what potential these tools have for understanding data and the real world it describes.
It seems that machine learning algorithms have a lot to offer, and the matter deserves more exploration. What about the professional inspiration I mentioned? Well, in transportation planning, quality data about urban activity is necessary but often difficult to get. This is especially true for nightlife, for which data for planning and decision making is almost completely lacking. Meanwhile, interest in Israel has grown in providing transit service at night, with many special night lines running in cities around the country, mainly to serve nightlife activities and keep the alcohol away from the wheel. Good, accessible data would be a great service to such important tasks, so I was inspired to pursue this kind of research in the future.
Thank you for reading! I would love to know what you think.
Founder @Mind The Map | Web GIS Developer at Tel Aviv Municipality
5 years ago: That's some great data! Nice!
Corporate Finance & Valuation @ Atwell | Making the Complicated Simple
5 years ago: This kind of analysis holds a lot of value for urban planning, if only we had enough data. The results are debatable when you perform a 'big data' type analysis on a limited data set. However, given the huge potential you uncovered even in a small drill, it's clear that cities will benefit from the collection of nonpersonal yet detailed data. The classic candidates for supplying such info are obviously businesses. While big businesses will take care of their digital presence internally, small businesses generally wouldn't (you can test if I'm right by analyzing the data you already collected). Hence, I see a mutual benefit in the city helping to set up small businesses' Google entries (for example).
Consultant
5 years ago: Proud of you son! Great work!
COO @ The Reeder | VC | Advisor
5 years ago: @Kiran Vajapey - this was an interesting read
CEO@Amy Metom | Stanford GSB
5 years ago: Very interesting initiative.