Practical Natural Language Processing for Getting Good Wifi in Hostels


I was planning my trip to Amsterdam in January and was looking through hostels on Hostelworld, filtering for different features and amenities. One amenity I knew I would need was free wifi, both so I could do some programming from the hostel and because life demands it in general. While there are a ton of hostels that offer free wifi, I’ve definitely been on the short end of the stick, where the quality of the wifi has been unmentionably bad. This probably goes for hotels as well as hostels, but hostels are generally cheaper and offer less in the way of complimentary services.

That got me thinking about building an application that could judge the quality of wifi from reviews. On a whim, I decided to spin up a scraper/API for Hostelworld that could find the reviews mentioning wifi and other useful amenities. Instead of meticulously scanning through hundreds of reviews, I could just scrape the reviews, parse out keywords, and assign a sentiment score to each review.

Eventually I made it into a Twitter bot at @HostelReviewBot.

Heh, I am replying to myself. Try it out yourselves!

Mention @HostelReviewBot, link a hostel from Hostelworld, and include one of the words wifi, breakfast, noise, bathroom, or shower.

The positive and negative numbers refer to the number of reviews with positive and negative sentiment, respectively. The quote is picked based on an overall average of the common words mentioned in the scraped reviews. Overall, there’s too much information to stuff into 140 characters, which is quite a shame. I should learn how to create a quick Flask API. Maybe that’s for later.
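As a rough sketch of how the positive and negative counts and the 140-character limit fit together, here is one way it could look. This is not the bot’s actual code: the tiny word lists and the function names review_polarity, tally, and format_reply are all assumptions made up for illustration, and a simple word-list count stands in for whatever sentiment scoring the bot really uses.

```python
# Tiny hypothetical word lists, for illustration only; a real bot would
# more likely rely on a trained sentiment model or a full lexicon.
POSITIVE = {"good", "great", "fast", "excellent", "reliable"}
NEGATIVE = {"bad", "slow", "terrible", "awful", "unreliable"}


def review_polarity(text):
    """Return +1, -1, or 0 from a simple count of lexicon words."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return (score > 0) - (score < 0)


def tally(reviews):
    """Count how many reviews lean positive and how many lean negative."""
    polarities = [review_polarity(r) for r in reviews]
    return polarities.count(1), polarities.count(-1)


def format_reply(keyword, positive, negative, quote, limit=140):
    """Build a reply and trim the quote so the whole text fits in `limit`."""
    head = f"{keyword}: {positive} positive, {negative} negative. "
    room = limit - len(head) - 2  # two characters for the quotation marks
    if room < 4:
        # Not enough space for a meaningful quote, so send the counts only.
        return head.strip()[:limit]
    if len(quote) > room:
        quote = quote[: room - 3] + "..."
    return f'{head}"{quote}"'


reviews = [
    "The wifi was fast and reliable.",
    "Terrible wifi, so slow!",
    "Wifi exists.",
]
pos, neg = tally(reviews)
print(format_reply("wifi", pos, neg, "The wifi was fast and reliable."))
```

Neutral reviews (like the third one here) are left out of both counts, which is one reason the two numbers may not add up to the total number of reviews.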

Let’s go through a quick tutorial of some Python stuff.

The idea is to first create a list of amenities that we would like to track for each hostel. I can think of five pretty important things that a hostel or hotel should have that aren’t rated on Hostelworld or on another review site like Tripadvisor. Amenities is a dictionary in which each key is an amenity and each value is a list of synonyms that could be used in text to describe it. That way, if someone writes “wifi” as “wi fi” or “wi-fi”, or uses another term like “internet”, we can still track their opinion. I welcome more ideas for things to track that are currently ambiguous or require the effort of reading through reviews.
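A minimal sketch of what such a dictionary and a lookup against it might look like is below. The five amenity names come from the bot’s keyword list, but the synonym lists and the helper names compile_patterns and amenities_mentioned are my own guesses for illustration, not the code from the full post.

```python
import re

# Hypothetical amenity -> synonyms mapping. The synonym lists are
# illustrative guesses and would need tuning against real reviews.
AMENITIES = {
    "wifi": ["wifi", "wi fi", "wi-fi", "internet"],
    "breakfast": ["breakfast"],
    "noise": ["noise", "noisy", "loud"],
    "bathroom": ["bathroom", "toilet"],
    "shower": ["shower", "showers"],
}


def compile_patterns(amenities):
    """Build one case-insensitive, whole-word regex per amenity."""
    patterns = {}
    for name, synonyms in amenities.items():
        alternation = "|".join(re.escape(s) for s in synonyms)
        patterns[name] = re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)
    return patterns


PATTERNS = compile_patterns(AMENITIES)


def amenities_mentioned(review):
    """Return the sorted list of amenities that a review text mentions."""
    return sorted(name for name, pat in PATTERNS.items() if pat.search(review))


print(amenities_mentioned("The Wi-Fi was terrible but the showers were hot"))
```

Matching on whole words keeps a synonym from firing inside an unrelated longer word, and keeping the synonyms in one dictionary means a newly spotted spelling only has to be added in one place.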

READ MORE FOR PYTHON STUFF AT NLP For Hostel Reviews To Twitter Bot

Eric Tang

AI + People + Technology Leader | Engineer

9 years ago

Always coming up with new ways to think. Keep it up!

Charlyn Gonda

Software Engineer | Creative maker | Public speaker | Chronic problem solver | xoogler, xuber

9 years ago

Oh you do have it! Fantastic!

Charlyn Gonda

Software Engineer | Creative maker | Public speaker | Chronic problem solver | xoogler, xuber

9 years ago

I'd be interested to see the sample size that the sentiment score came from. A score of 97 from 3 people seems less accurate than a score of 81 from 100.

Charlyn Gonda

Software Engineer | Creative maker | Public speaker | Chronic problem solver | xoogler, xuber

9 years ago

This is awesome! :D


