The London Pleasantness Index
Niamh Kingsley | Clapham Common

As part of completing my Professional Certificate in Data Science, I have been working on a project that not only uses some of the skills required for the course but may also be of interest to other people as a collaborative piece.

To see the index in action (and follow any updates I make in the future), as well as the full report, please check out: https://github.com/kingnif/Coursera_Capstone/blob/master/londonPleasantnessIndex.ipynb

1.      Introduction

London, with a population fast approaching 9 million and 73 constituencies across 33 boroughs, is certainly a big place. It is no surprise, then, that deciding which neighbourhood to rent in, or buy a first home in, is an especially daunting task. Whether you are looking to settle in London, or are an estate agent trying to market particular boroughs based on their strengths, understanding which boroughs stand out against certain metrics is useful.

It is not difficult to find out which areas are generally good for commuting, or for schools, but what about neighbourhoods that people are happy to live in, that have lots of park space, that are good for socialising? What about boroughs that are just quite “pleasant”?

For this research, I have decided to create a simple Pleasantness Index based on three factors:

·    Happiness of local people

·    Amount of green space (parks and trees)

·    Number of coffee shops within walking distance of the centre of the borough

I decided to try to define something that I have not seen before, and I hope it will be useful for people who struggle to find a reliable indication of which areas are nice (based on just these factors), without considering the usual factors such as whether there are schools nearby or how many bus links there are. Perhaps it could also be used by local councils and estate agents.

Figure 1: the 33 boroughs of London represented with folium

2.      Data Selection

I used two different sources to supply data for the three factors.

Firstly, I used recent data (2018/19) from the London Datastore to rank each of the 33 London boroughs based on the reported happiness of the people who live there. The report provides a mean score on a scale of 0-10, with 10 being the highest reported score [1].

I also used the London Datastore to find out what percentage of each borough is “green” (based on parks and trees) [2]. I chose these data sets because they are readily available in a consistent .xls or .csv format, and because they are relatively recent.

Secondly, to determine how many coffee shops are within 1km of the centre of each borough, I used the Foursquare API. I ran a geocoder over the list of boroughs to assign a latitude and longitude to each one, then passed a unique Foursquare request URL for each borough to the API. A short piece of code counted the results from each call and assigned the count back to the corresponding borough.

3.      Methodology

To select the factors, I first explored the data available on the London Datastore to understand which good-quality, recent data sets were available for use. I also drew on my knowledge of the new-build housing market, where young first-time buyers look for green spaces in developments, on-site coffee shops, and a friendly local culture. This analysis led me to select three factors that I felt were appropriate for many different stakeholders, and it lays the foundation for further research based on additional factors.

To prepare the data, I took mean scores for reported happiness and the percentage of a borough that is “green” and applied ranking code to assign a number to each borough from 1 to 33, with first place being the most positive result for that factor (happiest residents, highest percentage of green space).
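As a rough illustration of this ranking step, here is a minimal sketch using pandas. It is not the notebook's actual code: the column names (`HappinessMean`, `GreenPercent`), the helper `add_rank`, and the sample values are all hypothetical, and the choice to give tied boroughs the same (best) rank is an assumption.

```python
import pandas as pd

# Hypothetical sample values for illustration only, not the real Datastore figures
df = pd.DataFrame({
    "Borough": ["A", "B", "C", "D"],
    "HappinessMean": [7.6, 7.2, 7.6, 7.4],
    "GreenPercent": [35.0, 42.5, 28.1, 42.5],
})


def add_rank(frame, value_col, rank_col):
    """Return a copy of frame with rank_col added; the highest value gets rank 1."""
    out = frame.copy()
    # ascending=False puts the largest value first; method="min" gives ties the same best rank
    out[rank_col] = out[value_col].rank(ascending=False, method="min").astype(int)
    return out


df = add_rank(df, "HappinessMean", "HappinessRank")
df = add_rank(df, "GreenPercent", "GreenRank")
print(df)
```

With 33 boroughs and no ties, the same call would produce ranks 1 to 33 for each factor.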

To pre-process the data for the Foursquare call, I used a geocoder to assign a latitude and longitude to each borough, and plotted the results on a map to check their accuracy. As part of the processing I also dropped any columns that were not required and made sure the data was usable. One example of where I needed to amend the data came when I first ran the geolocator: some of the results were not in the United Kingdom, so I went back and added “, London” to each borough name to ensure that the geolocation was based on a more precise address.

import pandas as pd
from geopy.geocoders import Nominatim
from geopy.extra.rate_limiter import RateLimiter

# Nominatim needs a user agent string that identifies the application
locator = Nominatim(user_agent="foursquare_agent")

# delay between geocoding calls
geocode = RateLimiter(locator.geocode, min_delay_seconds=1)

# add location column
dfM['Geolocation'] = dfM['Borough'].apply(geocode)

# create longitude and latitude, split into separate columns
dfM['point'] = dfM['Geolocation'].apply(lambda loc: tuple(loc.point) if loc else None)

# a geopy point unpacks to (latitude, longitude, altitude); altitude is not needed
dfM[['Latitude', 'Longitude', 'Altitude']] = pd.DataFrame(dfM['point'].tolist(), index=dfM.index)
dfM = dfM.drop(["Altitude"], axis=1)
dfM.head()

# Nice! These all look to be London Latitudes!

Figure 2: an example of the pre-processing I used prior to calling the Foursquare API
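The “, London” fix and the accuracy check described above can be sketched as follows. This is a minimal illustration rather than the notebook's code: the helper names `qualify` and `in_london` are hypothetical, and the bounding box is only an approximate outline of Greater London that I have assumed for this example.

```python
import pandas as pd

# Approximate bounding box for Greater London (an assumption for this sketch)
LAT_MIN, LAT_MAX = 51.28, 51.70
LON_MIN, LON_MAX = -0.52, 0.34


def qualify(borough):
    """Append ', London' so the geocoder resolves the name within London."""
    return f"{borough}, London"


def in_london(lat, lon):
    """Return True if the coordinates fall inside the rough bounding box."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


boroughs = pd.DataFrame({"Borough": ["Croydon", "Ealing"]})
# The qualified name is what would be passed to the geocoder
boroughs["Query"] = boroughs["Borough"].apply(qualify)
print(boroughs)
```

A check like `in_london` could flag any geocoded point that lands outside London without relying solely on inspecting the map by eye.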

By creating a unique URL for each borough based on the search query (“Coffee”), my credentials, and the borough's latitude and longitude, I was able to make 33 separate calls to the Foursquare API. I then assigned the count of results for each borough to its own variable (coffeeResultsCountn, where n is the borough index). This allowed me to rank each borough again, this time based on the number of results returned.
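A minimal sketch of the URL-building and counting steps is shown below. It assumes the legacy Foursquare v2 `venues/search` endpoint and its `response.venues` payload shape; the article does not say which endpoint was used, so treat both as assumptions. The function names, the version date, the placeholder credentials, and the sample payload are hypothetical, and no network call is made here.

```python
from urllib.parse import urlencode

# Assumed endpoint: the legacy Foursquare v2 venue search
SEARCH_ENDPOINT = "https://api.foursquare.com/v2/venues/search"


def build_url(lat, lon, client_id, client_secret, query="Coffee",
              radius=1000, limit=50, version="20200101"):
    """Build one search URL for a borough centre (radius in metres)."""
    params = {
        "client_id": client_id,
        "client_secret": client_secret,
        "v": version,            # API version date, hypothetical value
        "ll": f"{lat},{lon}",
        "query": query,
        "radius": radius,
        "limit": limit,
    }
    return f"{SEARCH_ENDPOINT}?{urlencode(params)}"


def count_venues(payload):
    """Count venues in a parsed JSON response; 0 if the keys are missing."""
    return len(payload.get("response", {}).get("venues", []))


url = build_url(51.3714, -0.0977, "YOUR_ID", "YOUR_SECRET")

# Hypothetical payload standing in for the parsed JSON of a real response
sample_payload = {"response": {"venues": [{"name": "Cafe A"}, {"name": "Cafe B"}]}}
coffee_counts = {"Croydon": count_venues(sample_payload)}
print(url)
print(coffee_counts)
```

In this sketch the counts are held in a dictionary keyed by borough name rather than in 33 separately named variables, which makes it straightforward to turn them into a DataFrame column for ranking.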

4.      Results

I found the results interesting, as the top five most “pleasant” boroughs are not entirely what I anticipated. That said, these are five boroughs that I understand to be popular among young people and first-time buyers, although there is likely to be quite a large range of house prices across the results.

Each of the boroughs is outside central London, with Croydon, Havering, Kingston upon Thames, and Hounslow falling largely in zones 5, 6, 6, and 5 respectively. The exception here is Ealing, which is largely in zone 3 and which, although popular with first-time buyers, does have some notably expensive areas.

Figure 3: showing the final top 5 rankings
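The article does not spell out how the three factor ranks are combined into the final ordering. One plausible approach, assumed purely for illustration, is to sum the three ranks with equal weight and rank the totals so that the lowest sum comes first. The column names and sample ranks below are hypothetical.

```python
import pandas as pd

# Hypothetical per-factor ranks for three boroughs (1 = best)
ranks = pd.DataFrame({
    "Borough": ["A", "B", "C"],
    "HappinessRank": [1, 3, 2],
    "GreenRank": [2, 1, 3],
    "CoffeeRank": [3, 1, 2],
})

rank_cols = ["HappinessRank", "GreenRank", "CoffeeRank"]

# Assumption: equal weighting, so the index is the plain sum of the three ranks
ranks["RankSum"] = ranks[rank_cols].sum(axis=1)

# A lower sum is better, so the default ascending rank gives the best borough rank 1
ranks["PleasantnessRank"] = ranks["RankSum"].rank(method="min").astype(int)

top = ranks.sort_values("PleasantnessRank").head(5)
print(top[["Borough", "RankSum", "PleasantnessRank"]])
```

A weighted sum would be a natural variation if some factors were judged more important than others.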

5.      Discussion

I would like to take this investigation further and introduce more factors to the index. I intentionally chose three simple factors, but I recognise that this choice is based on my own interpretation of what “pleasant” means. I have no doubt that the results would be significantly different if, rather than considering coffee shops, I had decided to look at kebab shops or hairdressers or Porsche dealerships!

Going forward then, it would be better to define a more standardised meaning of “pleasantness” by surveying people, and then looking to include more factors to meet that expectation. I feel that this index is fit for purpose as a starting point and could be useful for renters, first-time buyers, retired people, estate agents or local councils, but it could become much more useful and reflective of a broader interpretation if more considerations were included.

6.      Conclusion

In conclusion, this investigation is very much a starting point when it comes to understanding which boroughs are “pleasant”. My own bias and (simple) analysis led me to choose three factors that have been important to me when looking to buy a house, and I have no doubt that for many other people these factors are much further down the list, or not important at all.

Therefore, in order to take this further, I could adopt my own recommendations from the discussion and build out a notion of what is “pleasant” based on a wider survey. Including more factors will help to differentiate the scores between boroughs and make the research more useful for potential stakeholders.

You can find the full results and follow any updates to the index by checking out the GitHub link at the top of the article.

[1] Personal Well-being (Happiness) by Borough, Office for National Statistics https://data.london.gov.uk/dataset/subjective-personal-well-being-borough

[2] London Green and Blue Cover, Greater London Authority https://data.london.gov.uk/dataset/green-and-blue-cover



Mark Pearce

Director | MACH Technologies | Delta Capita

4 years ago

I used to live in the very pleasant borough of Croydon (close to the long-defunct Croydon Airport), so not surprised to see it in the top 5. How about adding traffic volume as an additional factor? Great work!

Rahel Haque CA

Climate and ESG Capital Markets - Private Finance Lead

4 years ago

Stumbled across this but doesn’t surprise me Croydon is your number 1 - always underrated and full of greenery that will only be more appreciated in the post-COVID world. Nice work!

Tahir Zafar

Banking and Capital Markets Advisory | Strategy, AI, Data and Digital Transformation

4 years ago

apart from not seeing my borough in the Top 5 this is awesome - epic work

Dr Stuart Haw

Lecturer in Health Studies

4 years ago

Brilliant idea and great metrics
