If you are not tracking your brand sentiment, here’s HOW you should do? - Guide to Sentiment Analysis in R
Vishal Bagla
Building AmbitionBox | VP - Product | Ex - Consultant | IIM C | INDmoney | Alvarez & Marsal | American Express | Oracle
Sentiment Analysis in R Using Twitter API
This post is a continuation of the first post: If you are not tracking your brand sentiment, here’s WHY you should do?. You may go through the first post for a better understanding of this post.
Ever since the first tweet was posted, Twitter has been a platform for expressing opinions, views, news, and complaints, and even for holding short conversations. Roughly 500 million tweets are sent each day (Source – SocialPilot). Imagine the value of this data. Harnessed in the right manner, it can provide insights at a level that was unthinkable a few years ago. In this post, I will talk about mining the sentiment of tweets posted by a Twitter handle: creating a Twitter app, integrating the API with R, and then mining the sentiment of the tweets after basic data cleaning. The post is written to serve as a guide while you implement the steps in R.
This post is divided into four sections.
1. Creating a Twitter App
2. Integrating R with Twitter API
3. Data Cleaning in R
4. Sentiment Analysis
We will discuss each of the four steps in detail and see the implementation side by side.
1. Creating a Twitter App
The first step is to register on www.apps.twitter.com and create an app so that you get the credentials required to fetch data in R.
Once you log in and click on Create New App, you will see a screen like the one below. Fill in the details depending on your requirements.
After completing all the details, click on Create Your Twitter Application and your app will be created. In the app section, you will get a page as below.
Here, all your account details will be present and you will be able to access the keys and access token. You will need keys and access token to integrate your app with R so as to fetch data from a Twitter feed.
Please DO NOT share your keys and access tokens with anyone for security and privacy reasons.
Now that you have created an app in Twitter and have access to all the keys and tokens, let’s move to the second part: the implementation in R.
2. Integrating R with Twitter API
From your Twitter app, you should have four things handy with you before we proceed further.
i. Consumer Key (API Key)
ii. Consumer Secret (API Secret)
iii. Access Token
iv. Access Token Secret
Now, let’s connect R with Twitter using an R package called twitteR. According to the package description, twitteR provides an interface to the Twitter web API. We will use the R code below to connect to the Twitter API, extract the tweets, and convert them into a data frame.
install.packages("twitteR")
library("twitteR")
Now, using access and consumer keys and tokens let’s connect R with Twitter and read tweets from the twitter handle of Forbes magazine (@Forbes). Please note that this Twitter handle is taken only for illustrative purpose.
# Please note that the below keys are randomly generated. Please enter your keys and access token generated by Twitter App.
consumer_key <- '0123456789ABDEGH12345670'
consumer_secret <- '987654321ABCDEFH12347890'
access_token <- '654387EFGHI1234567890ASDFGH'
access_secret <- '01458jhgfdswABHI1237890'
setup_twitter_oauth(consumer_key, consumer_secret, access_token, access_secret)
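# If setup_twitter_oauth asks whether to cache the credentials in a local file, answer the prompt (Yes was chosen here)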
forbes_tweets <- userTimeline("Forbes", n=200)
setup_twitter_oauth authenticates the Twitter app credentials. Once the connection is established, you can start extracting tweets from your desired Twitter handle.
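Since the keys and tokens should not be shared, one option is to keep them out of the script altogether. The following is a minimal sketch, assuming the credentials are stored in environment variables; the variable name TWITTER_CONSUMER_KEY and the helper read_credential are hypothetical and not part of twitteR.

```r
# Hypothetical helper: read a credential from an environment variable and
# fail loudly if it is missing, so that keys never appear in shared code.
read_credential <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable not set: ", name)
  }
  value
}

# Demo only: a placeholder value so that the sketch runs end to end.
# In practice the variable would be set outside the script (for example in ~/.Renviron).
Sys.setenv(TWITTER_CONSUMER_KEY = "placeholder-key")
consumer_key <- read_credential("TWITTER_CONSUMER_KEY")
```

The other three values (consumer secret, access token, access token secret) could be read the same way and then passed to setup_twitter_oauth as shown above.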
> length.tweet <- length(forbes_tweets)
> length.tweet
[1] 191
The above output shows that 191 tweets were extracted from the Forbes Twitter handle, although 200 were requested.
There are limits on the number of tweets you can extract and on how far back in time you can go. We will not go into those details in this post.
dataframe.tweets <- twListToDF(forbes_tweets)
sample_dataframe = head(dataframe.tweets, n=5)
sample_dataframe
The output of sample_dataframe looks like below:
At this stage, we have extracted tweets from a Twitter handle and put them in a data frame in R for further analysis.
Now, let’s move to the third step: data cleaning.
3. Data Cleaning in R
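We will use an R package called ‘tm’ for text mining and text cleaning. The ‘tm’ package has multiple functions that help us clean text data and convert it into a structured format for further analysis. The specific cleaning steps in this post use the base R function gsub, so ‘tm’ is needed only if you extend the cleaning further.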
install.packages("tm")
library("tm")
Before going further, let’s see which variables are available in the dataframe.tweets data frame. We will use only the variables that are useful to us and discard the rest for ease of understanding.
> str(dataframe.tweets)
'data.frame':  191 obs. of  16 variables:
$ text         : chr  "Is rap superstar Chamillionaire's new app, Convoz the next 'big thing?' \nhttps://t.co/97rwKPRAWf https://t.co/cxkx15oOIz" "Introducing the world's largest household products and personal care companies of 2018:\nhttps://t.co/XIIgqNgwL"| __truncated__ "Index Ventures expects $2B fintech investments in startups like iZettle and Adyen\nhttps://t.co/RlVxn1wJQq http"| __truncated__ "Skin-whitening craze is popular in Asia --and the main ingredient is snail slime\nhttps://t.co/aQ14vzeyer https"| __truncated__ ...
$ favorited    : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
$ favoriteCount: num  27 39 31 60 32 38 120 33 50 119 ...
$ replyToSN    : chr  NA NA NA NA ...
$ created      : POSIXct, format: "2018-06-09 09:00:01" "2018-06-09 07:30:01" "2018-06-09 07:00:01" "2018-06-09 06:00:03" ...
$ truncated    : logi  FALSE TRUE FALSE FALSE FALSE FALSE ...
$ replyToSID   : chr  NA NA NA NA ...
$ id           : chr  "1005373942335442948" "1005351293282607104" "1005343743619280896" "1005328650936385537" ...
$ replyToUID   : chr  NA NA NA NA ...
$ statusSource : chr  "<a href=\"https://www.sprinklr.com\" rel=\"nofollow\">Sprinklr</a>" "<a href=\"https://www.sprinklr.com\" rel=\"nofollow\">Sprinklr</a>" "<a href=\"https://www.sprinklr.com\" rel=\"nofollow\">Sprinklr</a>" "<a href=\"https://www.sprinklr.com\" rel=\"nofollow\">Sprinklr</a>" ...
$ screenName   : chr  "Forbes" "Forbes" "Forbes" "Forbes" ...
$ retweetCount : num  21 12 15 43 12 12 49 12 27 48 ...
$ isRetweet    : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
$ retweeted    : logi  FALSE FALSE FALSE FALSE FALSE FALSE ...
$ longitude    : logi  NA NA NA NA NA NA ...
$ latitude     : logi  NA NA NA NA NA NA ...
There are 16 variables in the data frame; however, we will use only one: the text variable, which contains the actual tweeted text.
Let’s first see how our actual tweets look before we do any further processing.
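Keeping only the variables of interest can be sketched as below. This is an illustration on a small mock data frame rather than the real dataframe.tweets; the counts are the first three values shown in the str output above, the tweet texts are placeholders, and the names mock_tweets, keep_columns, and tweets_subset are made up for the example. The two popularity counts are kept alongside the text only because they can be useful later.

```r
# Mock stand-in for dataframe.tweets (three rows, four of the sixteen columns).
mock_tweets <- data.frame(
  text = c("Tweet one", "Tweet two", "Tweet three"),
  favoriteCount = c(27, 39, 31),
  retweetCount = c(21, 12, 15),
  screenName = "Forbes",
  stringsAsFactors = FALSE
)

# Select columns by name; everything else is dropped.
keep_columns <- c("text", "favoriteCount", "retweetCount")
tweets_subset <- mock_tweets[, keep_columns]
str(tweets_subset)
```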
> head(dataframe.tweets$text, n=10)
 [1] "Is rap superstar Chamillionaire's new app, Convoz the next 'big thing?' \nhttps://t.co/97rwKPRAWf https://t.co/cxkx15oOIz"
 [2] "Introducing the world's largest household products and personal care companies of 2018:\nhttps://t.co/XIIgqNgwLH… https://t.co/fikAPxAHh3"
 [3] "Index Ventures expects $2B fintech investments in startups like iZettle and Adyen\nhttps://t.co/RlVxn1wJQq https://t.co/MWgsxr0VdM"
 [4] "Skin-whitening craze is popular in Asia --and the main ingredient is snail slime\nhttps://t.co/aQ14vzeyer https://t.co/Pm2iChIPra"
 [5] "https://t.co/wCjrJBkFAT"
 [6] "Impact PartnersVoice: Finding added peace of mind in retirement https://t.co/K5f58kcBRE https://t.co/MsxEg1xi52"
 [7] "Apple has finally released a Walkie-Talkie function for the Apple Watch\nhttps://t.co/RpR6CuOmkl https://t.co/IbUkPxmFAC"
 [8] "BraintreeVoice: How Braintree Extend takes the headache out of contextual commerce https://t.co/jTn7gl1mEq https://t.co/qYa0mNXWc3"
 [9] "5 financial myths, and what you should know instead:\nhttps://t.co/mftcpytphN https://t.co/N75Me1P06H"
[10] "How much Gen X and millennials should have saved at every age: \nhttps://t.co/eCI7vkcZVc https://t.co/CFnJugRnRo"
In the above output, we can see the ‘text’ variable contains tweeted content plus other content such as URLs, stop words, etc. We will remove all the unwanted content before we do sentiment analysis.
> # Remove URLs starting with https
> dataframe.tweets2 <- gsub("https.*","",dataframe.tweets$text)
> head(dataframe.tweets2, n=10)
 [1] "Is rap superstar Chamillionaire's new app, Convoz the next 'big thing?' \n"
 [2] "Introducing the world's largest household products and personal care companies of 2018:\n"
 [3] "Index Ventures expects $2B fintech investments in startups like iZettle and Adyen\n"
 [4] "Skin-whitening craze is popular in Asia --and the main ingredient is snail slime\n"
 [5] ""
 [6] "Impact PartnersVoice: Finding added peace of mind in retirement "
 [7] "Apple has finally released a Walkie-Talkie function for the Apple Watch\n"
 [8] "BraintreeVoice: How Braintree Extend takes the headache out of contextual commerce "
 [9] "5 financial myths, and what you should know instead:\n"
[10] "How much Gen X and millennials should have saved at every age: \n"
Here, we have removed the URLs from the text. Note that the pattern removes everything from the first occurrence of https to the end of the tweet, not just the URL itself.
I recommend comparing the text after every step with the original tweeted content to appreciate the change each line of code achieves.
Tweet #5 comes out blank, with just quotation marks. Can you find out why?
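Because the pattern "https.*" also deletes any words that follow a link, a more targeted pattern may be preferable when URLs appear in the middle of a tweet. The following is a minimal sketch of that alternative, not the approach used in the rest of this post; the sample tweet and its URL are made up for illustration.

```r
# Made-up tweet with a link in the middle of the sentence.
sample_tweet <- "Read this https://t.co/abc123 before the weekend"

# The pattern used above: drops everything from the first "https" onward.
greedy <- gsub("https.*", "", sample_tweet)
# greedy is "Read this "

# Alternative sketch: remove only the URL token itself, whether http or https,
# up to the next whitespace character.
targeted <- gsub("https?://\\S+", "", sample_tweet)
# targeted is "Read this  before the weekend" (the double space can be collapsed later)
```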
> # Remove URLs starting with http
> dataframe.tweets2 <- gsub("http.*","",dataframe.tweets2)
> head(dataframe.tweets2, n=10)
 [1] "Is rap superstar Chamillionaire's new app, Convoz the next 'big thing?' \n"
 [2] "Introducing the world's largest household products and personal care companies of 2018:\n"
 [3] "Index Ventures expects $2B fintech investments in startups like iZettle and Adyen\n"
 [4] "Skin-whitening craze is popular in Asia --and the main ingredient is snail slime\n"
 [5] ""
 [6] "Impact PartnersVoice: Finding added peace of mind in retirement "
 [7] "Apple has finally released a Walkie-Talkie function for the Apple Watch\n"
 [8] "BraintreeVoice: How Braintree Extend takes the headache out of contextual commerce "
 [9] "5 financial myths, and what you should know instead:\n"
[10] "How much Gen X and millennials should have saved at every age: \n"
> # Remove hashtags - starting with #
> dataframe.tweets2 <- gsub("#.*","",dataframe.tweets2)
> head(dataframe.tweets2, n=10)
 [1] "Is rap superstar Chamillionaire's new app, Convoz the next 'big thing?' \n"
 [2] "Introducing the world's largest household products and personal care companies of 2018:\n"
 [3] "Index Ventures expects $2B fintech investments in startups like iZettle and Adyen\n"
 [4] "Skin-whitening craze is popular in Asia --and the main ingredient is snail slime\n"
 [5] ""
 [6] "Impact PartnersVoice: Finding added peace of mind in retirement "
 [7] "Apple has finally released a Walkie-Talkie function for the Apple Watch\n"
 [8] "BraintreeVoice: How Braintree Extend takes the headache out of contextual commerce "
 [9] "5 financial myths, and what you should know instead:\n"
[10] "How much Gen X and millennials should have saved at every age: \n"
> # Remove if any other Twitter handle is tagged in the post
> dataframe.tweets2 <- gsub("@.*","",dataframe.tweets2)
> head(dataframe.tweets2, n=10)
 [1] "Is rap superstar Chamillionaire's new app, Convoz the next 'big thing?' \n"
 [2] "Introducing the world's largest household products and personal care companies of 2018:\n"
 [3] "Index Ventures expects $2B fintech investments in startups like iZettle and Adyen\n"
 [4] "Skin-whitening craze is popular in Asia --and the main ingredient is snail slime\n"
 [5] ""
 [6] "Impact PartnersVoice: Finding added peace of mind in retirement "
 [7] "Apple has finally released a Walkie-Talkie function for the Apple Watch\n"
 [8] "BraintreeVoice: How Braintree Extend takes the headache out of contextual commerce "
 [9] "5 financial myths, and what you should know instead:\n"
[10] "How much Gen X and millennials should have saved at every age: \n"
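The hashtag and handle patterns above ("#.*" and "@.*") behave like the URL pattern: they remove everything from the first # or @ to the end of the tweet. None of the ten tweets shown contains either character, so the output is unchanged here. For tweets that do, a token-level alternative can be sketched as follows; the sample tweet and the handle in it are made up for illustration.

```r
# Made-up tweet with a handle and a hashtag in the middle of the sentence.
sample_tweet <- "Thanks @someuser for the #budgeting tips this week"

# The pattern used above cuts the tweet off at the first "#".
cut_off <- gsub("#.*", "", sample_tweet)
# cut_off is "Thanks @someuser for the "

# Alternative sketch: remove only the hashtag and handle tokens,
# then collapse the leftover runs of whitespace into single spaces.
no_tags <- gsub("#\\w+", "", sample_tweet)
no_mentions <- gsub("@\\w+", "", no_tags)
tidy <- gsub("\\s+", " ", no_mentions)
# tidy is "Thanks for the tips this week"
```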
> # Remove newline characters (\n) from the tweets
> dataframe.tweets2 <- gsub("\n","",dataframe.tweets2)
> head(dataframe.tweets2, n=10)
 [1] "Is rap superstar Chamillionaire's new app, Convoz the next 'big thing?' "
 [2] "Introducing the world's largest household products and personal care companies of 2018:"
 [3] "Index Ventures expects $2B fintech investments in startups like iZettle and Adyen"
 [4] "Skin-whitening craze is popular in Asia --and the main ingredient is snail slime"
 [5] ""
 [6] "Impact PartnersVoice: Finding added peace of mind in retirement "
 [7] "Apple has finally released a Walkie-Talkie function for the Apple Watch"
 [8] "BraintreeVoice: How Braintree Extend takes the headache out of contextual commerce "
 [9] "5 financial myths, and what you should know instead:"
[10] "How much Gen X and millennials should have saved at every age: "
We have cleaned our tweets to the extent that they are ready for sentiment analysis. There are multiple other text mining functions in the ‘tm’ package that can be helpful depending on the quality and structure of your data.
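Two small follow-up steps are sometimes useful at this point: trimming the trailing spaces that remain after cleaning, and identifying tweets that became empty (such as tweet #5). The following is a minimal base R sketch of both on three of the cleaned tweets shown above; the variable names are made up for the example, and whether to drop empty tweets is a judgment call that this post does not make.

```r
# Three cleaned tweets from the output above (numbers 6, 5 and 9).
cleaned <- c(
  "Impact PartnersVoice: Finding added peace of mind in retirement ",
  "",
  "5 financial myths, and what you should know instead:"
)

# Remove leading and trailing whitespace.
trimmed <- trimws(cleaned)

# Logical vector marking tweets that still contain text. Keeping it as a
# separate vector lets the same filter be applied to other columns too.
has_text <- nzchar(trimmed)
non_empty <- trimmed[has_text]
```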
Now, we will move to the fourth part where we carry out sentiment analysis on the extracted tweets and dive further.
4. Sentiment Analysis
We will use a package called ‘syuzhet’ in R to carry out sentiment analysis on the tweets.
install.packages("syuzhet")
library("syuzhet")
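The ‘syuzhet’ functions work on character vectors and not on data frames. Despite its name, dataframe.tweets2 is already a character vector (that is what gsub returns), so the as.vector call below simply makes this explicit before we carry out sentiment analysis.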
> vector.tweets <- as.vector(dataframe.tweets2)
> head(vector.tweets, n=10)
 [1] "Is rap superstar Chamillionaire's new app, Convoz the next 'big thing?' "
 [2] "Introducing the world's largest household products and personal care companies of 2018:"
 [3] "Index Ventures expects $2B fintech investments in startups like iZettle and Adyen"
 [4] "Skin-whitening craze is popular in Asia --and the main ingredient is snail slime"
 [5] ""
 [6] "Impact PartnersVoice: Finding added peace of mind in retirement "
 [7] "Apple has finally released a Walkie-Talkie function for the Apple Watch"
 [8] "BraintreeVoice: How Braintree Extend takes the headache out of contextual commerce "
 [9] "5 financial myths, and what you should know instead:"
[10] "How much Gen X and millennials should have saved at every age: "
get_nrc_sentiment and get_sentiment are the two functions from the ‘syuzhet’ package that we will use in this post.
> emotion.tweets <- get_nrc_sentiment(vector.tweets)
> head(emotion.tweets, n=10)
   anger anticipation disgust fear joy sadness surprise trust negative positive
1      0            0       0    0   1       0        0     1        0        1
2      0            0       0    0   0       0        0     1        0        1
3      0            0       0    0   0       0        0     0        0        0
4      0            0       1    0   0       0        0     0        0        1
5      0            0       0    0   0       0        0     0        0        0
6      0            2       0    1   2       1        0     2        1        2
7      0            2       1    1   1       0        1     1        0        1
8      0            0       0    0   0       0        0     1        1        1
9      0            0       0    0   0       0        0     0        0        0
10     0            0       0    0   0       0        0     0        0        0
The above output gives the score for each emotion for each tweet. A score of 0 means the tweet contains no words associated with that emotion, while a score of 1 means it contains one such word. A higher score therefore indicates a stronger association with the emotion.
> emotion.tweets2 <- cbind(dataframe.tweets2, emotion.tweets)
> head(emotion.tweets2, n=10)
                                                                         dataframe.tweets2 anger anticipation disgust fear joy sadness surprise
1                 Is rap superstar Chamillionaire's new app, Convoz the next 'big thing?'      0            0       0    0   1       0        0
2  Introducing the world's largest household products and personal care companies of 2018:     0            0       0    0   0       0        0
3        Index Ventures expects $2B fintech investments in startups like iZettle and Adyen     0            0       0    0   0       0        0
4         Skin-whitening craze is popular in Asia --and the main ingredient is snail slime     0            0       1    0   0       0        0
5                                                                                              0            0       0    0   0       0        0
6                         Impact PartnersVoice: Finding added peace of mind in retirement      0            2       0    1   2       1        0
7                  Apple has finally released a Walkie-Talkie function for the Apple Watch     0            2       1    1   1       0        1
8      BraintreeVoice: How Braintree Extend takes the headache out of contextual commerce      0            0       0    0   0       0        0
9                                     5 financial myths, and what you should know instead:     0            0       0    0   0       0        0
10                         How much Gen X and millennials should have saved at every age:      0            0       0    0   0       0        0
   trust negative positive
1      1        0        1
2      1        0        1
3      0        0        0
4      0        0        1
5      0        0        0
6      2        1        2
7      1        0        1
8      1        1        1
9      0        0        0
10     0        0        0
The above output gives us a better presentation: each tweet appears alongside its associated emotion scores.
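One simple way to summarise the emotion scores is to total each emotion column across tweets. The sketch below does this on a mock data frame that reproduces the eight emotion columns of the ten rows printed above (the negative and positive columns are left out); the names mock_emotions and emotion_totals are made up, and ten tweets are far too few to draw conclusions from.

```r
# The eight emotion columns for the ten tweets shown above.
mock_emotions <- data.frame(
  anger        = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  anticipation = c(0, 0, 0, 0, 0, 2, 2, 0, 0, 0),
  disgust      = c(0, 0, 0, 1, 0, 0, 1, 0, 0, 0),
  fear         = c(0, 0, 0, 0, 0, 1, 1, 0, 0, 0),
  joy          = c(1, 0, 0, 0, 0, 2, 1, 0, 0, 0),
  sadness      = c(0, 0, 0, 0, 0, 1, 0, 0, 0, 0),
  surprise     = c(0, 0, 0, 0, 0, 0, 1, 0, 0, 0),
  trust        = c(1, 1, 0, 0, 0, 2, 1, 1, 0, 0)
)

# Total score per emotion, largest first.
emotion_totals <- sort(colSums(mock_emotions), decreasing = TRUE)
emotion_totals
```

On the real data, the same call could be applied to the emotion columns of emotion.tweets, and the result could be passed to barplot for a quick chart.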
Now, let’s get a sentiment score for each of the tweets.
> sentiment.score <- get_sentiment(vector.tweets)
> head(sentiment.score, n=10)
[1] 1.60 1.60 0.50 0.05 0.00 0.75 0.40 0.00 0.00 0.80
The higher the score, the more positive the tweet. The package has built-in dictionaries that assign a sentiment score to individual words; the total sentiment score for a tweet is calculated from the scores of the words it contains.
You can create your own word dictionary with associated sentiment scores, but that would be a humongous task.
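To make the idea of dictionary-based scoring concrete, here is a toy sketch in base R. It is not how ‘syuzhet’ is implemented and it does not use the package’s dictionary: the four words and their scores in toy_lexicon are invented for illustration, and score_text is a hypothetical helper.

```r
# Invented mini-dictionary: word -> sentiment score.
toy_lexicon <- c(easy = 0.5, peace = 0.75, headache = -0.5, died = -1.0)

# Hypothetical helper: split the text into lowercase words and add up
# the scores of the words found in the dictionary.
score_text <- function(text, lexicon) {
  words <- strsplit(tolower(text), "[^a-z']+")[[1]]
  sum(lexicon[words[words %in% names(lexicon)]])
}

score_text("Finding added peace of mind", toy_lexicon)  # 0.75
score_text("A whale died of starvation", toy_lexicon)   # -1
score_text("5 financial myths", toy_lexicon)            # 0 (no dictionary words)
```

A real dictionary contains thousands of scored words, which is why building one from scratch is such a large task.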
> sentiment.tweets = cbind(sentiment.score, emotion.tweets2)
> head(sentiment.tweets, n=10)
   sentiment.score                                                                       dataframe.tweets2 anger anticipation disgust fear joy
1             1.60                Is rap superstar Chamillionaire's new app, Convoz the next 'big thing?'      0            0       0    0   1
2             1.60 Introducing the world's largest household products and personal care companies of 2018:     0            0       0    0   0
3             0.50       Index Ventures expects $2B fintech investments in startups like iZettle and Adyen     0            0       0    0   0
4             0.05        Skin-whitening craze is popular in Asia --and the main ingredient is snail slime     0            0       1    0   0
5             0.00                                                                                             0            0       0    0   0
6             0.75                        Impact PartnersVoice: Finding added peace of mind in retirement      0            2       0    1   2
7             0.40                 Apple has finally released a Walkie-Talkie function for the Apple Watch     0            2       1    1   1
8             0.00     BraintreeVoice: How Braintree Extend takes the headache out of contextual commerce      0            0       0    0   0
9             0.00                                    5 financial myths, and what you should know instead:     0            0       0    0   0
10            0.80                         How much Gen X and millennials should have saved at every age:      0            0       0    0   0
   sadness surprise trust negative positive
1        0        0     1        0        1
2        0        0     1        0        1
3        0        0     0        0        0
4        0        0     0        0        1
5        0        0     0        0        0
6        1        0     2        1        2
7        0        1     1        0        1
8        0        0     1        1        1
9        0        0     0        0        0
10       0        0     0        0        0
In the above output, we have combined tweets, emotion score and sentiment score to have a combined view.
Now, if we want to look at only the positive or only the negative tweets, we can do so in the manner shown below.
> # Getting positive tweets and associated scores
> positive.tweets <- sentiment.tweets[which(sentiment.tweets$sentiment.score > 0),]
> head(positive.tweets, n=5)
  sentiment.score                                                                       dataframe.tweets2 anger anticipation disgust fear joy
1            1.60                Is rap superstar Chamillionaire's new app, Convoz the next 'big thing?'      0            0       0    0   1
2            1.60 Introducing the world's largest household products and personal care companies of 2018:     0            0       0    0   0
3            0.50       Index Ventures expects $2B fintech investments in startups like iZettle and Adyen     0            0       0    0   0
4            0.05        Skin-whitening craze is popular in Asia --and the main ingredient is snail slime     0            0       1    0   0
6            0.75                        Impact PartnersVoice: Finding added peace of mind in retirement      0            2       0    1   2
  sadness surprise trust negative positive
1       0        0     1        0        1
2       0        0     1        0        1
3       0        0     0        0        0
4       0        0     0        0        1
6       1        0     2        1        2
> # Getting negative tweets and associated scores
> negative.tweets <- sentiment.tweets[which(sentiment.tweets$sentiment.score < 0),]
> head(negative.tweets, n=5)
   sentiment.score                                                                 dataframe.tweets2 anger anticipation disgust fear joy
20           -0.50               NASA scientists have discovered the building blocks of life on Mars     0            0       0    0   0
25           -0.75                              What to do if your social life is making you broke:      0            0       0    1   0
33           -1.75 Ransomware infection in Atlanta lost years of police evidence, photos, and videos     0            0       0    2   0
35           -0.75                          How recalled memory triggers physical stress responses:      0            0       0    0   0
54           -1.25                         A whale died of starvation after eating 80 plastic bags:      0            0       0    1   0
   sadness surprise trust negative positive
20       0        0     0        0        1
25       1        0     0        1        0
33       1        0     1        2        1
35       0        0     0        1        0
54       1        0     0        1        0
> # Getting neutral tweets and associated scores
> neutral.tweets <- sentiment.tweets[which(sentiment.tweets$sentiment.score == 0),]
> head(neutral.tweets, n=5)
   sentiment.score                                                                   dataframe.tweets2 anger anticipation disgust fear joy
5                0                                                                                         0            0       0    0   0
8                0 BraintreeVoice: How Braintree Extend takes the headache out of contextual commerce      0            0       0    0   0
9                0                                5 financial myths, and what you should know instead:     0            0       0    0   0
11               0                                                                                         0            0       0    0   0
12               0                                                                                         0            0       0    0   0
   sadness surprise trust negative positive
5        0        0     0        0        0
8        0        0     1        1        1
9        0        0     0        0        0
11       0        0     0        0        0
12       0        0     0        0        0
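Instead of building three separate data frames, the same split can be expressed as a label per tweet, which also makes it easy to count each group. The following is a minimal sketch using the ten sentiment scores printed earlier; the names scores, labels, and label_counts are made up for the example.

```r
# The ten sentiment scores from head(sentiment.score, n=10) above.
scores <- c(1.60, 1.60, 0.50, 0.05, 0.00, 0.75, 0.40, 0.00, 0.00, 0.80)

# Label each tweet by the sign of its score.
labels <- ifelse(scores > 0, "positive",
                 ifelse(scores < 0, "negative", "neutral"))

# Count tweets per label; fixing the factor levels keeps empty groups visible.
label_counts <- table(factor(labels, levels = c("negative", "neutral", "positive")))
label_counts
# negative 0, neutral 3, positive 7
```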
Now, if we want to get the most positive or the most negative tweet and associated score:
> max.positive = sentiment.tweets[which.max(sentiment.tweets$sentiment.score),]
> max.positive
    sentiment.score                               dataframe.tweets2 anger anticipation disgust fear joy sadness surprise trust negative
147            2.15 3 easy methods for building credibility at work     0            0       0    0   0       0        0     1        0
    positive
147        2
> max.negative <- sentiment.tweets[which.min(sentiment.tweets$sentiment.score),]
> max.negative
   sentiment.score                                                                                     dataframe.tweets2 anger anticipation
78            -1.9 Bill and Melinda Gates' nonprofit aims to develop medicines for malaria, tuberculosis, and diarrhea…      0            1
   disgust fear joy sadness surprise trust negative positive
78       1    1   0       1        0     0        1        1
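which.max and which.min return a single row each. To see several of the most positive or most negative tweets, the data frame can be sorted by score instead. The following is a minimal sketch on a mock data frame; the tweet labels and the zero and 0.5 scores are placeholders, and the name mock_scores is made up for the example.

```r
# Mock stand-in for sentiment.tweets with placeholder tweet text.
mock_scores <- data.frame(
  tweet = c("tweet A", "tweet B", "tweet C", "tweet D"),
  sentiment.score = c(0.5, -1.25, 2.15, 0),
  stringsAsFactors = FALSE
)

# Sort from most positive to most negative.
ranked <- mock_scores[order(mock_scores$sentiment.score, decreasing = TRUE), ]

top_two <- head(ranked, n = 2)     # tweet C, then tweet A
bottom_two <- tail(ranked, n = 2)  # tweet D, then tweet B
```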
I’m sure that after going through this article you will be able to carry out a sentiment analysis on your own or any other Twitter handle. By combining the sentiment analysis with the popularity of tweets, you can find out which kinds of tweets resonate well with customers and tweak your Twitter strategy accordingly.
I would be happy to hear about your experience with sentiment analysis and any other methods you may have worked with.
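Combining sentiment with popularity can be sketched as follows. The example uses the first ten sentiment scores and the first ten retweetCount values printed earlier in this post; the names popularity and mean_retweets are made up, and ten tweets are far too few to support any conclusion, so treat this purely as an illustration of the mechanics.

```r
# First ten sentiment scores and retweet counts from the outputs above.
popularity <- data.frame(
  sentiment.score = c(1.60, 1.60, 0.50, 0.05, 0.00, 0.75, 0.40, 0.00, 0.00, 0.80),
  retweetCount = c(21, 12, 15, 43, 12, 12, 49, 12, 27, 48)
)

# Label each tweet by the sign of its sentiment score.
popularity$label <- ifelse(popularity$sentiment.score > 0, "positive",
                           ifelse(popularity$sentiment.score < 0, "negative", "neutral"))

# Average number of retweets per sentiment label.
mean_retweets <- aggregate(retweetCount ~ label, data = popularity, FUN = mean)
mean_retweets
```

The same aggregation could be run on favoriteCount, or on the full set of extracted tweets, to compare how the different groups perform.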