2022 Beijing Winter Olympic Games: Doing a Sentiment Analysis of the tweets
How can we analyze tweets in almost real time? Communication during the Olympic Games is fascinating because so many eyes around the world are on this massive event. Let's use sentiment analysis to measure the reception of the Games on Twitter.
Olympic Games Context Introduction
Right now, in Beijing, China, the XXIV Olympic Winter Games are happening. With the motto "Together for a Shared Future", 2871 athletes from 91 nations will be competing in 15 different sports.
Of course, there are people in favor of the Games and people against them, and we won't discuss that here. What we cannot deny is that these are the most politicized Games since the Cold War era. The USA has announced a diplomatic boycott, Tibetan and Uyghur independence groups are demonstrating, and on top of that comes the COVID-19 situation.
On the other hand, according to the IOC Marketing & Broadcasting report, in Tokyo 2020 there were "6.1B engagements (likes, comments, shares and video views on Olympic post) on Olympic social media handles, across 9 social media platforms", from 25 Feb 2020 to 05 Sep 2021 (sources at the end of the article).
These interactions were both positive and negative, but they were interactions all the same. We need to remember that a significant share of public opinion was against holding the Tokyo 2020 Olympic Games because of the COVID situation. But once the Games started, opinion changed:
As we can see in this graph from NTT Data, there was a turnaround in the ratio between positive and negative tweets once the Olympic Games started. This tells us that it is important to analyze the data with an understanding of its context. That includes when the data was produced: in this case, when the tweet was posted.
Introduction to Sentiment Analysis
In this article, I would like to show that running this kind of analysis is not difficult at all, and that the tools are available to all of us.
In this case, we will analyze 30,000 tweets scraped from Twitter today, February 6th, a few hours before I post this: from 9:30 to 18:55. A scraped tweet must fulfill two conditions to be accepted: it must contain the word 'olympics' in its text, and it must be in English.
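The two acceptance conditions and the time window can be expressed as a simple filter. This is only a minimal sketch: the field names `text`, `lang`, and `created_at`, and the sample records themselves, are assumptions for illustration and not the structure of the scraped data used in this article.

```python
from datetime import datetime


def keep_tweet(tweet, keyword="olympics", lang="en"):
    """Return True if the tweet is in the target language and mentions the keyword."""
    return tweet["lang"] == lang and keyword in tweet["text"].lower()


def in_window(tweet, start, end):
    """Return True if the tweet timestamp falls inside [start, end]."""
    return start <= tweet["created_at"] <= end


# Hypothetical records; real scraped tweets may use different field names.
tweets = [
    {"text": "Loving the Olympics opening!", "lang": "en",
     "created_at": datetime(2022, 2, 6, 10, 5)},
    {"text": "Los Olympics son geniales", "lang": "es",
     "created_at": datetime(2022, 2, 6, 10, 7)},
    {"text": "Snow day today", "lang": "en",
     "created_at": datetime(2022, 2, 6, 11, 0)},
    {"text": "olympics hockey tonight", "lang": "en",
     "created_at": datetime(2022, 2, 6, 19, 30)},
]

start = datetime(2022, 2, 6, 9, 30)
end = datetime(2022, 2, 6, 18, 55)

# Only the first record passes: the others fail on language, keyword, or time.
accepted = [t for t in tweets if keep_tweet(t) and in_window(t, start, end)]
```

The keyword check is case-insensitive here, which is a design choice of the sketch rather than something the scraping step is known to do.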
Using Python's NLTK library, plus a tokenization process, we will:
Note: This is just a rough analysis. We need to take into consideration that there is bias in the counting for two main reasons:
Note 2: All times in the analysis are GMT+1 (Central European Time).
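To give an idea of what a cleaning and tokenization step can look like, here is a minimal sketch using only the standard library. It is not the exact pipeline used for this article: the regular expressions and the tiny stopword set are assumptions for illustration. With NLTK installed, `nltk.tokenize.TweetTokenizer` and the NLTK stopwords corpus could take the place of the regex split and the hand-written set.

```python
import re

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
# A word is a run of letters/digits, optionally followed by an apostrophe part ("don't").
TOKEN_RE = re.compile(r"[a-z0-9]+(?:'[a-z]+)?")

# Tiny illustrative stopword set; NLTK's stopwords corpus is far more complete.
STOPWORDS = {"the", "a", "an", "is", "at", "to", "of", "and", "in", "rt"}


def clean_and_tokenize(text):
    """Lowercase, strip URLs and @mentions, split into word tokens, drop stopwords."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = MENTION_RE.sub(" ", text)
    tokens = TOKEN_RE.findall(text)
    return [t for t in tokens if t not in STOPWORDS]


tokens = clean_and_tokenize("RT @fan: The #Olympics opening is AMAZING! https://t.co/abc")
# tokens == ["olympics", "opening", "amazing"]
```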
Analyzing Olympic Tweets
Let's start.
After cleaning the 30,000 tweets, we got these value counts.
On their own, the counts do not let us draw any conclusion, so we run a sentiment analysis on each tweet and classify it as positive, neutral, or negative.
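The classification step can be sketched as a threshold on a sentiment score between -1 and 1. Everything below is an assumption for illustration: the 0.05 threshold is the cut-off commonly used with VADER scores, and `toy_score` with its small lexicon is a made-up stand-in so that the example runs without downloads. With NLTK, the score could instead come from `SentimentIntensityAnalyzer().polarity_scores(text)["compound"]`, which requires `nltk.download("vader_lexicon")`. VADER is one of the scorers NLTK offers and may not be the one used for the charts in this article.

```python
def label_sentiment(score, threshold=0.05):
    """Map a score in [-1, 1] to 'positive', 'neutral', or 'negative'."""
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical toy lexicon, only so the sketch is runnable offline.
TOY_LEXICON = {
    "great": 1.0, "love": 1.0, "amazing": 1.0,
    "bad": -1.0, "boycott": -1.0, "terrible": -1.0,
}


def toy_score(text):
    """Average lexicon value over matching words; 0.0 if no word matches."""
    hits = [TOY_LEXICON[w] for w in text.lower().split() if w in TOY_LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


texts = [
    "what a great olympics",
    "olympics schedule for monday",
    "terrible call by the judges",
]
labels = [label_sentiment(toy_score(t)) for t in texts]
# labels == ["positive", "neutral", "negative"]
```

Keeping the scorer separate from the labeling function makes it easy to swap the toy scorer for a real one without touching the thresholds.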
Between 9:30 and 18:55 there is no negative peak. In fact, sentiment is positive overall, at least for the period of time we are analyzing.
Let's make our time range smaller and analyze only the tweets from 10:00 to 11:00. It is very interesting to see the peaks and the clustering.
Crossing the information with the schedule, we can see that at 9:30 the Speed Skating Finals were happening, while at 9:40 the Ice Hockey game between JPN and CHN started. This could be a reason why we have that many tweets between 10:00 and 10:20. Data with context becomes information... and that is what we are looking for.
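Finding peaks like these comes down to counting tweets per time bucket. A minimal sketch, assuming 10-minute buckets and a handful of made-up timestamps (neither is taken from the real dataset):

```python
from collections import Counter
from datetime import datetime


def bucket_start(ts, minutes=10):
    """Round a timestamp down to the start of its N-minute bucket."""
    return ts.replace(minute=ts.minute - ts.minute % minutes, second=0, microsecond=0)


# Hypothetical posting times inside the 10:00 to 11:00 window.
timestamps = [
    datetime(2022, 2, 6, 10, 2),
    datetime(2022, 2, 6, 10, 8),
    datetime(2022, 2, 6, 10, 14),
    datetime(2022, 2, 6, 10, 41),
]

counts = Counter(bucket_start(ts) for ts in timestamps)

# The busiest bucket is the candidate to cross-check against the event schedule.
busiest, n_tweets = counts.most_common(1)[0]
# busiest == datetime(2022, 2, 6, 10, 0), n_tweets == 2
```

A smaller bucket size shows finer clustering, at the cost of noisier counts.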
Positive or Negative?
Now... the tweets in general, were they positive or negative?
It's very interesting to see how positivity beats negativity. The large neutral share is understandable, as many news items are objective and, with that, neutral.
Word Clouds
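The overall split can be computed as the share of each label across all classified tweets. A minimal sketch; the label list below is invented for illustration and does not reproduce the results of this analysis:

```python
from collections import Counter


def sentiment_shares(labels):
    """Return the fraction of tweets carrying each sentiment label."""
    counts = Counter(labels)
    total = len(labels)
    if total == 0:
        return {"positive": 0.0, "neutral": 0.0, "negative": 0.0}
    return {name: counts[name] / total for name in ("positive", "neutral", "negative")}


# Hypothetical labels, one per tweet.
labels = ["positive"] * 5 + ["neutral"] * 3 + ["negative"] * 2
shares = sentiment_shares(labels)
# shares == {"positive": 0.5, "neutral": 0.3, "negative": 0.2}
```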
Finally, one of the most amazing tools we have found in marketing is the Word Cloud. What are the words we find (along with 'olympics') in the positive tweets?
And, what are the negative ones?
Fascinating, right?
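Behind a word cloud there is just a word-frequency table per sentiment group. Here is a minimal sketch with the standard library, assuming made-up sample tweets and a simple rule that drops the search word 'olympics' and words shorter than three letters. The resulting counts could then be handed to a plotting tool, for example `WordCloud().generate_from_frequencies(freqs)` from the third-party `wordcloud` package, which is one option and may not be the tool used for the images in this article.

```python
import re
from collections import Counter


def word_frequencies(texts, exclude=("olympics",), min_length=3):
    """Count words across texts, skipping excluded and very short words."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in exclude and len(word) >= min_length:
                counts[word] += 1
    return counts


# Hypothetical tweets already classified as positive.
positive_texts = [
    "Olympics skating was beautiful",
    "Beautiful ceremony at the Olympics",
]

freqs = word_frequencies(positive_texts)
top_word, top_count = freqs.most_common(1)[0]
# top_word == "beautiful", top_count == 2
```

Running the same function on the negative tweets gives the second cloud.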
Thanks for reading!
Sources: https://stillmed.olympics.com/media/Documents/International-Olympic-Committee/IOC-Marketing-And-Broadcasting/Tokyo-2020-External-Communications.pdf
---------------------------------------------------------------------------------------------------------------
Author: Ignacio Ariznabarreta - JIAF Consulting