Using the Natural Language Toolkit (NLTK) in Python to analyze text sentiment
Varun Lobo
Data Scientist | Automotive Engineering | Analytics | Agile | Python | SQL | Data Science
The Natural Language Toolkit, more commonly known as NLTK, is a collection of libraries that make it easy to manipulate and interpret human language. It is written for the Python programming language.
There are two widely used packages for analyzing text sentiment: VADER and TextBlob. VADER (Valence Aware Dictionary for sEntiment Reasoning) is a rule- and lexicon-based tool that is sensitive to both the polarity (positive/negative) and the intensity (strength) of emotion. TextBlob is also lexicon-based: sentiment is determined by the semantic orientation and intensity of each word in a sentence, using a predefined dictionary of words classified as positive or negative and weighted within the sentence. TextBlob returns the polarity of a text in the range [-1, 1].
A critical difference between TextBlob and VADER is that VADER is focused on social media. It therefore puts a lot of effort into identifying the sentiment of content that typically appears on social media, such as emojis, repeated words, punctuation, and slang.
In this tutorial, I demonstrate how to analyze text and identify its sentiment using the TextBlob library in Python.
Step 1 is to import the library into your working environment. Make sure you first install the TextBlob and NLTK packages, using either pip or conda.
from textblob import TextBlob
Step 2 is to create an instance of the TextBlob class by passing the text to analyze as a string argument.
text = "I really enjoyed this tutorial!"
blob = TextBlob(text)
Step 3 is to read the sentiment.polarity attribute of the instance to obtain the sentiment score.
sentiment = blob.sentiment.polarity
print(sentiment)
You can find the full code in my GitHub repo (Link).