An overview of the combined power of Twitter and Python

Meghna Goswami

Manager - Cyber, Risk and Regulatory at PwC | MS-IS graduate - UT Arlington-College of Business | Engineer

Published: May 21, 2019

Many of us have heard about the fascinating and powerful visualizations and calculations that Python makes possible, but how many of us have actually seen them up close? This blog gives an overview of the combined power of Twitter and Python for Social Media Analytics…

The following are a few important steps in Social Media Analytics with Twitter data:

Data Collection :

Twython is an actively maintained, pure Python wrapper for the Twitter API. It supports both the normal and the streaming Twitter APIs. Here it is used to extract the required tweets by filtering on keywords and time range.
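A minimal sketch of keyword and time-range filtering with Twython follows. It is not the code used for this post: the helper names (`build_query`, `collect_tweets`, `make_client`), the example keyword and the dates are assumptions for illustration. It relies on Twython's `search` method and on the `since:` / `until:` operators of Twitter's standard search, which only reaches back a limited number of days. Access to the Twitter API has also changed since this post was written, so credentials and endpoints may need adjusting.

```python
def build_query(keyword, since, until):
    """Combine a keyword with Twitter's since:/until: search operators.

    Dates are strings in YYYY-MM-DD format.
    """
    return f"{keyword} since:{since} until:{until}"


def collect_tweets(client, keyword, since, until, count=100):
    """Return the text of tweets matching the keyword within the date range.

    `client` is expected to be a Twython instance (anything with a
    compatible `search` method works, which makes the helper easy to test).
    """
    results = client.search(
        q=build_query(keyword, since, until),
        count=count,              # the standard search API caps this at 100
        tweet_mode="extended",    # ask for untruncated text
    )
    # Extended tweets carry "full_text"; fall back to "text" otherwise.
    return [
        status.get("full_text") or status.get("text", "")
        for status in results.get("statuses", [])
    ]


def make_client(app_key, app_secret, oauth_token, oauth_token_secret):
    """Create a Twython client from placeholder credentials."""
    # Imported here so the helpers above can be tried without Twython installed.
    from twython import Twython

    return Twython(app_key, app_secret, oauth_token, oauth_token_secret)
```

With real credentials, usage would look like `collect_tweets(make_client(...), "python", "2019-05-01", "2019-05-08")`, which returns a plain list of tweet texts ready for the later steps.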

Sentiment Analysis :

TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more.

Matplotlib is a Python 2D plotting library which produces publication quality figures in a variety of hardcopy formats and interactive environments across platforms. It tries to make easy things easy and hard things possible. One can generate plots, histograms, power spectra, bar charts, error charts, scatterplots, etc., with just a few lines of code.

In the figures below, TextBlob is used together with Matplotlib to plot the polarity and subjectivity scores for the corpus of tweets.

Word cloud :

NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning, as well as wrappers for industrial-strength NLP libraries.

The figure below shows a Word Cloud. Stop words were first removed from the tweets using the NLTK package, and the remaining words were stemmed using the Porter Stemmer algorithm. The resulting words were then fed into the Word Cloud module.
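The preprocessing described above can be sketched as follows. This is an illustration under stated assumptions rather than the code behind the figure: the function names, the regex-based tokenizer and the image size are choices made here, and the `wordcloud` package is assumed to be the Word Cloud module in question. NLTK's English stop-word list has to be downloaded once before use.

```python
import re
from collections import Counter

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer


def load_stop_words():
    """Download (if needed) and return NLTK's English stop-word list."""
    nltk.download("stopwords", quiet=True)
    return set(stopwords.words("english"))


def stemmed_frequencies(tweets, stop_words):
    """Count Porter-stemmed words across tweets, skipping stop words."""
    stemmer = PorterStemmer()
    counts = Counter()
    for text in tweets:
        # Drop URLs, then keep alphabetic tokens only (a simple tokenizer).
        cleaned = re.sub(r"https?://\S+", " ", text.lower())
        for word in re.findall(r"[a-z]+", cleaned):
            if word not in stop_words:
                counts[stemmer.stem(word)] += 1
    return counts


def render_word_cloud(frequencies, path="wordcloud.png"):
    """Render the word frequencies to an image file."""
    # Imported here so the preprocessing can be used without this package.
    from wordcloud import WordCloud

    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(path)
    return path
```

Typical usage would be `render_word_cloud(stemmed_frequencies(tweets, load_stop_words()))`. Because the words are stemmed, the cloud shows stems such as "run" rather than "running".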

Topic Modeling :

After removing stop words and stemming, topic analysis is conducted with Non-negative Matrix Factorization (NMF) from scikit-learn and Latent Dirichlet Allocation (LDA) from Gensim.

References and further reading :

https://twython.readthedocs.io/en/latest/

https://matplotlib.org/

https://towardsdatascience.com/topic-modelling-in-python-with-nltk-and-gensim-4ef03213cd21
