Sentiment Analysis: Exploring Reddit Posts for Selected Tickers
Mihai Tanase
Senior Professional Engineer and Data Analytics Enthusiast | Expert in Mechanical Engineering | Diverse Experience in Pharma, Infrastructure, and Renewable Energy Projects
Introduction and Objective
As a personal project to apply my recently acquired data analytics skills, I'm venturing into financial data science. This project focuses on daily sentiment analysis of stock news, particularly for tickers relevant to my investment strategy. These tickers may not be the first choice of retail traders, but I want to understand retail sentiment toward them to better inform my market decisions.
Data Collection and Initial Analysis
To get started, I used the Reddit API to extract data from influential subreddits such as wallstreetbets, stocks, and StockMarket. I began by examining 200 posts and 20 comments, then narrowed the scope to 50 posts per day to keep the sentiment analysis relevant. I concentrated on post titles and selftext (the post body) to identify emerging trends or noteworthy topics.
Tools and Techniques
Using the Sentiment Intensity Analyzer and the WordCloud library, I identified the dominant sentiment and the most frequent terms, such as "stock" and "market". These terms came up repeatedly in discussions where retail investors speculated about market trends. For instance, posts about the DOCU ticker showed noticeably positive sentiment, with words like "market" and "higher" prevalent, suggesting sustained buying interest despite a recent price surge.
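To make the scoring step concrete, here is a minimal sketch. It assumes the analyzer is NLTK's VADER implementation (the post does not say which package was used), and the function names, the stop-word list, and the 0.05 cut-off (a commonly used VADER convention) are illustrative choices rather than the project's actual code.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; a real run would use a fuller one).
STOP_WORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "is", "it", "for", "on"}


def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def word_frequencies(texts):
    """Count lowercase words across all texts, skipping stop words."""
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


def score_texts(texts):
    """Return one VADER compound score per text.

    Requires nltk plus the 'vader_lexicon' resource, so the import is
    done lazily and the helpers above stay usable without it.
    """
    from nltk.sentiment.vader import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return [analyzer.polarity_scores(text)["compound"] for text in texts]
```

The counts returned by `word_frequencies` can be passed to `WordCloud().generate_from_frequencies(...)` from the wordcloud package to draw the cloud.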
Further Insights with PRAW
To deepen the analysis, I employed PRAW, a Python wrapper for Reddit's API, to extract and analyze post comments. This additional layer of analysis offered more nuanced views on various stock tickers.
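A rough sketch of how comment extraction with PRAW might look is shown below. The function name, the returned dictionary layout, and the default limits are assumptions for illustration; `reddit` is expected to be a `praw.Reddit` client created with your own API credentials.

```python
def collect_comments(reddit, subreddit_name, ticker, post_limit=50, comment_limit=20):
    """Collect posts matching `ticker` and the first few comments on each.

    `reddit` should be a praw.Reddit instance, built with the client_id,
    client_secret and user_agent of your own Reddit app.
    """
    results = []
    subreddit = reddit.subreddit(subreddit_name)
    for submission in subreddit.search(ticker, limit=post_limit):
        # Drop "load more comments" placeholders instead of fetching them,
        # which keeps the number of API requests low.
        submission.comments.replace_more(limit=0)
        comments = submission.comments.list()[:comment_limit]
        results.append(
            {
                "id": submission.id,
                "title": submission.title,
                "selftext": submission.selftext,
                "comments": [comment.body for comment in comments],
            }
        )
    return results
```

Calling `collect_comments(reddit, "stocks", "DOCU")` would return one dictionary per matching post, ready to be scored comment by comment.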
Challenges and Solutions
Admittedly, the approach wasn't without its challenges. The analysis sometimes captured long texts that mentioned a specific ticker but were mostly about other stocks. To address this, I refined the sentiment analysis so that it targets the intended ticker more precisely.
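One simple way to narrow the analysis to the intended ticker is to score only the sentences that mention it. The sketch below illustrates that idea; it is not necessarily the refinement used in the project, and the function name and the naive sentence splitting are assumptions.

```python
import re


def sentences_mentioning(text, ticker):
    """Return only the sentences that mention `ticker` as a whole word.

    Matches "DOCU" and "$DOCU" but not "DOCUMENTS". The match is
    case-sensitive so tickers are not confused with ordinary words.
    """
    pattern = re.compile(
        r"(?<![A-Za-z0-9])\$?" + re.escape(ticker) + r"(?![A-Za-z0-9])"
    )
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences if pattern.search(s)]
```

Only the returned sentences would then be passed to the sentiment analyzer, so a long post about TSLA that mentions DOCU once contributes just that one sentence to the DOCU score.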
Storing results in SQLite helped overcome Jupyter Notebook's storage limits, and carefully placed delays between requests kept the collection within the Reddit API's rate limits. I also tailored the sentiment analysis models to better fit the financial context and improved database efficiency through batch data insertion.
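The batch insertion and request pacing described above could be sketched roughly as follows, using only the standard library. The table layout, column names, and helper names are hypothetical and not taken from the project.

```python
import sqlite3
import time


def init_db(conn):
    """Create the (hypothetical) posts table if it does not exist yet."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS posts (
               post_id  TEXT PRIMARY KEY,
               ticker   TEXT NOT NULL,
               title    TEXT,
               compound REAL
           )"""
    )


def insert_batch(conn, rows):
    """Insert many rows in a single transaction; repeated post_ids are ignored."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT OR IGNORE INTO posts (post_id, ticker, title, compound) "
            "VALUES (?, ?, ?, ?)",
            rows,
        )


def paced(items, delay_seconds=1.0):
    """Yield items with a pause between them to stay under an API rate limit."""
    for index, item in enumerate(items):
        if index:
            time.sleep(delay_seconds)
        yield item
```

Writing all rows of a day in one `executemany` call inside a single transaction avoids a commit per row, which is usually where most of the insertion time goes.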
Conclusion and Disclaimer
Despite certain constraints, this method proved quite effective for an initial understanding of daily sentiment trends in the stock market. It's important to note that this analysis is not an investment recommendation but rather an educational exercise in applying data science to finance.
Details About the Author:
My passion lies in uncovering valuable insights into the finance sector through data science. This project is just one example of how data can be utilized to understand complex market dynamics.
For more details, check out the full project here: https://github.com/DataVoyagerMT/Sentiment-Analysis-Exploring-Reddit-Posts-for-Selected-Tickers.git
I'd love to hear your thoughts and feedback!
DISCLAIMER: This post is intended for research and learning purposes only and should not be construed as an investment recommendation.
#DataScience #SentimentAnalysis #Reddit #FinancialData #StockMarket #Python #DataAnalytics #RedditAnalysis #WordCloud #Finance #DOCU #WBA #TSLA