Data Analytics with LLMs and the Stock Market in the Mix

"Data-driven" and "decision-making" are terms we use often, yet even when the data is available, we frequently fail to act on it effectively. Having data and making sense of it are two entirely different challenges. First, meaningful data analysis requires knowledge and skills. Second, we need the right tools to process data efficiently and apply that knowledge.

While learning is a personal journey, one that some embrace more than others, a lot of organized effort has gone into building analytics tools to make data analysis accessible to everyone. Yet, until recently, not only had we failed to achieve it, but the goal still seemed out of reach. Now, with LLMs in the mix, it feels much more tangible. However, remember: a fool with a tool is still a fool, only making disaster happen faster, so better invest in learning!


TL;DR:

The result of my project is an AI Agent that enables you to freely query and visualize data from a spreadsheet. The Agent may look as if it's prepped for open brain surgery, but it walks you through its thought process, and it’s worth watching this video, especially the execution part at the end: https://www.youtube.com/watch?v=FcNIgxRYjVI

As a bonus, you’ll also see how US IPOs of 2024 have performed.


The BI Landscape Today

The Business Intelligence (BI) domain feels like a child that has lost sight of its guiding principle: making better business decisions through data.

Due to increasing complexity, BI has outsourced ETL/ELT processes, analysis, and visualization to those technically capable, leaving decision makers dependent on others. And in a world where packaging matters more than content, I've often seen priorities shift towards dashboard branding and styling rather than insights. I’ve had a huge backlog of such "beauty" requests as a PM in this space! And as for the rest of the data pipeline, it sometimes feels like it's trying to redefine the meaning of 'complexity'.

This experience, combined with my own need to analyse data independently, motivated me to explore what lies beyond conventional BI approaches. I hope many others are working on similar projects so that the future of data analytics is brighter than it currently appears. And while I am just a PM who might not go that far, I hope you will!


My Wishlist

When I started this project, I reflected on both my biggest and smallest wishes, deciding to start with something feasible. I don’t know if I’ll reach the finish line on my own, but perhaps one day, I’ll collaborate with like-minded people to push this further.

To keep it brief, here’s what I wrote down and started building:

  1. Support freeform querying within a single structured data source (e.g., a database or a spreadsheet).
  2. Add visualizations.
  3. Allow querying and visualization of structured data across multiple sources (e.g., databases, CSV, or Excel sheets).
  4. Enable freeform querying and visualization of unstructured data.
  5. Eventually, combine structured and unstructured data sources.


Having grown organically, this AI Agent is the result of my efforts so far. It is based on the ReAct pattern and implemented in LangChain. However, it needs to be reimplemented in LangGraph. I am not a developer, so my attempt is far from perfect!
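
For readers new to ReAct: the agent alternates between a thought (reasoning about what to do next), an action (calling a tool), and an observation (reading the tool's result) until it can answer. The loop below is a minimal sketch of that idea in plain Python. It is not the LangChain code from the repo: `scripted_model` is a hard-coded stand-in for the LLM, and `preview_columns` is a hypothetical tool that returns made-up column names.

```python
from typing import Callable, Dict, List, Tuple

def preview_columns(_: str) -> str:
    """Hypothetical tool: pretends to inspect a spreadsheet (made-up columns)."""
    return "columns: ticker, industry, month 1"

# Tool registry: the agent may only call what is listed here.
TOOLS: Dict[str, Callable[[str], str]] = {"preview": preview_columns}

def scripted_model(history: List[str]) -> Tuple[str, str, str]:
    """Stand-in for the LLM. Returns (thought, action, action_input)."""
    if not any(line.startswith("Observation:") for line in history):
        return ("I should look at the columns first.", "preview", "")
    return ("I have what I need.", "finish", "The sheet has 3 columns.")

def react_loop(question: str, max_steps: int = 5) -> Tuple[str, List[str]]:
    """Run thought -> action -> observation until the model says 'finish'."""
    history = [f"Question: {question}"]
    for _ in range(max_steps):
        thought, action, action_input = scripted_model(history)
        history.append(f"Thought: {thought}")
        if action == "finish":
            return action_input, history
        observation = TOOLS[action](action_input)
        history.append(f"Action: {action}")
        history.append(f"Observation: {observation}")
    return "No answer within the step limit.", history

answer, trace = react_loop("What columns does the sheet have?")
print("\n".join(trace))
print("Answer:", answer)
```

The printed trace is the "thought process" the agent walks you through. In a real agent the model's reply is free text that has to be parsed into an action, which is where most of the fragility lives.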

GitHub Repo: https://github.com/jenyss/DataAnalyticsAIAgent


Agent Tools

The choice of tools was made after some research, and while I won’t elaborate on the details here, I will say that Pandas and DuckDB complement each other, enabling the solution to address a broader range of use cases.

  • Preview Excel Structure – Scans the spreadsheet to identify column names and data types.
  • Simple Dataframe Query – Executes basic queries to filter, sort, and retrieve specific data from the spreadsheet.
  • Complex DuckDB Query – Processes complex queries using SQL to perform aggregations, calculations, and in-depth data analysis.
  • Create Visualization – Generates charts and graphs to represent data visually, making trends and patterns easier to interpret.


The Interesting Part

The spreadsheet the Agent works with contains financial data related to the US IPOs of 2024, which I wanted to analyse.

Below are some examples of the questions I asked. If you have any questions, let me know. I will be rewriting this Agent, so stay tuned for a better version.

Draw performance by industry. You need to group the tickers by industry and calculate the average performance for each ticker in the group and then from there calculate the avg performance for the industry. where the average performance must be calculated only from the columns month 1 through month 13.         
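
The calculation this prompt asks for is an average of per-ticker averages. A minimal Pandas sketch of it follows, with made-up numbers. The column names `ticker`, `industry`, and `month 1` to `month 13` are assumed from the prompt wording, and skipping missing months is my assumption rather than something the Agent is known to do.

```python
import pandas as pd

# The 13 monthly performance columns named in the prompt.
month_cols = [f"month {i}" for i in range(1, 14)]

# Made-up rows: (ticker, industry, 13 monthly values). Not real IPO data.
rows = [
    ("AAA", "Tech", [10.0] * 13),
    ("BBB", "Tech", [30.0] * 13),
    ("CCC", "Health", [-5.0] * 12 + [None]),  # last month not available yet
]
df = pd.DataFrame(
    [{"ticker": t, "industry": ind, **dict(zip(month_cols, vals))}
     for t, ind, vals in rows]
)

# Step 1: average per ticker across the month columns only.
# Missing months are skipped (pandas default, skipna=True).
per_ticker = df[month_cols].mean(axis=1)

# Step 2: average those per-ticker averages within each industry.
by_industry = (
    per_ticker.groupby(df["industry"]).mean().sort_values(ascending=False)
)
print(by_industry)
```

Averaging per ticker first gives every ticker the same weight within its industry, regardless of how many months of data it has. Averaging all monthly values of an industry in one step would instead favour tickers with longer histories.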


Average IPO performance by Industry



Top performing Industry (IPOs only)



3rd performing Industry (IPOs only)


Worst performing Industry (IPOs only)

Draw a distribution of all ticker symbols based on proposed share price and avg performance, where the average performance must be calculated only from the columns month 1 through month 13.         

Pay attention! This visualization represents just a snapshot in time. It does not indicate whether you would have gained or lost if you had invested in any of these stocks; that depends entirely on when you bought and sold. The cutoff date was mid-January 2025. The minimum proposedSharePrice was 4.00 USD, with one exception: IMPACT BIOMEDICAL INC. (IBO), which started at 3.00 USD.

Avg performance of a ticker correlated to its proposed share price (think initial IPO price)


Below you can see where Reddit scored.


Reddit


