How non-programmers can use ChatGPT’s Code Interpreter to kickstart analysis
This article was first published on Capitalmind
By now everyone has heard about ChatGPT, and most likely other publicly available Large Language Models (LLMs) like Anthropic’s Claude, Google’s Bard, and Microsoft’s Bing Chat (built on ChatGPT).
Predictably there are hundreds of takes about the impact of AI on work and life in general. Depending on where you fall on the pessimism-optimism spectrum, this can range all the way from “AI will initiate ‘Skynet’ type human annihilation” to “AI will find a cure for every type of cancer”.
I’ll leave the blue-sky visioning of AI’s potential impacts to the people most qualified to do it, i.e. management consulting firms and think tanks.
But I can say this with certainty: AI has applications for every professional, no matter the field. Anyone who works at a computer can potentially harness AI to get more effective at their job.
“I really worry that people are not taking this seriously enough … this fundamentally is going to be a shift in how we work and how we interact at a level that’s as big as anything we’ve seen in our lifetimes.”
Ethan Mollick (Associate Professor, Wharton Business School)
GPT-4 can, given the right inputs, write Python code to analyse and visualise data. You describe what you want the code to do, and ChatGPT writes it for you to copy, paste, and run in a Python environment. This alone has been a game-changer for our team when doing repeatable analysis on datasets too large for MS Excel. But that workflow has its challenges: it can take a few iterations to refine the instructions into working code.
Enter Code Interpreter. Code Interpreter is a native Python environment within ChatGPT that can read and analyse uploaded files. Incidentally, the name ‘Code Interpreter’ is misleading: it doesn’t interpret code fed to it, it writes and executes code given simple English-language prompts.
Here’s a simple example of what Code Interpreter can do:
I downloaded the historical Nifty TRI closing values since inception from niftyindices.com and uploaded the file to ChatGPT-4 with the ‘Code Interpreter’ option enabled. What follows are the prompts (formatted as quotes) and ChatGPT’s responses. Note: my inputs are only the quoted text prefixed with the word “Prompt:”; all other text is ChatGPT’s verbatim response.
Prompt: attached file has closing index values. convert the ‘Date’ column to datetime and sort ascending. Then please come up with insights from the data. Use charts where appropriate.
The data provided contains the following columns:
Next, I’ll perform some basic exploratory data analysis, create some visualizations, and generate insights based on this data.
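(For readers curious what happens behind the scenes: here is a minimal sketch of the kind of pandas code Code Interpreter generates for this first step. The column names and the inline data are hypothetical stand-ins for the actual niftyindices.com file.)

```python
import pandas as pd

# Synthetic stand-in for the uploaded file; the real column names may differ.
df = pd.DataFrame({
    "Date": ["03-Jan-2000", "01-Jan-2000", "02-Jan-2000"],
    "Total Returns Index": [1050.0, 1000.0, 1020.0],
})

# Parse the 'Date' column to datetime and sort ascending, as the prompt asks.
df["Date"] = pd.to_datetime(df["Date"], format="%d-%b-%Y")
df = df.sort_values("Date").reset_index(drop=True)
```

After this, every downstream calculation (returns, drawdowns, rolling windows) can rely on the rows being in chronological order.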
The chart illustrates the historical performance of the Nifty TRI (Total Return Index) over time, from June 1999 to the most recent data point.
The line chart shows a generally upward trend, indicating that the Nifty TRI has increased over this period. This suggests that the overall market, as represented by this index, has grown during this time.
However, it’s important to note the presence of some significant fluctuations and volatile periods. These likely correspond to various economic events and market conditions. Let’s examine the data more closely to identify periods of significant growth and decline.
To do this, I’ll calculate the year-over-year returns for the Nifty TRI. This will provide a clearer picture of how the index has performed on an annual basis.
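(The year-over-year calculation ChatGPT describes here boils down to a few lines of pandas. This sketch uses synthetic daily data in place of the actual index file.)

```python
import pandas as pd

# Synthetic daily index levels standing in for the Nifty TRI.
idx = pd.date_range("2019-01-01", "2021-12-31", freq="D")
tri = pd.Series(range(1000, 1000 + len(idx)), index=idx, dtype=float)

# Take the last value of each calendar year, then the year-over-year change.
yearly = tri.groupby(tri.index.year).last()
yoy_returns = yearly.pct_change() * 100  # first year is NaN: no prior year
```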
This chart shows the year-over-year (YoY) return of the Nifty TRI, which provides a more granular view of the index’s performance.
A few key observations can be made:
These insights provide an understanding of the index’s historical performance and its susceptibility to broader market and economic events. As always, past performance is not indicative of future results, but it can provide a useful context for understanding the possible risks and returns associated with this index.
Prompt: are there other charts you can make to help understand the data better and glean more insight?
Absolutely, we can create additional charts that may provide further insights into the data.
Let’s start by calculating and plotting the 1-year rolling return.
This chart shows the 1-year rolling return of the Nifty TRI. The rolling return is the return an investor would have earned if they had invested exactly 1 year prior to any given date.
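(A rolling 1-year return is just each day’s value divided by the value roughly a year earlier. A sketch with synthetic data, assuming one row per calendar day so 365 rows approximate one year:)

```python
import pandas as pd

# Synthetic daily series growing about 0.03% per day.
idx = pd.date_range("2018-01-01", periods=800, freq="D")
tri = pd.Series([1000.0 * 1.0003 ** i for i in range(800)], index=idx)

# 1-year rolling return: each day's value versus the value 365 days earlier.
rolling_1y = (tri / tri.shift(365) - 1) * 100  # first 365 values are NaN
```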
Key insights from this chart include:
Next, let’s move on to creating a monthly returns heatmap. This will require calculating the monthly returns first.
This heatmap shows the monthly returns of the Nifty TRI for each month of each year. The color coding makes it easy to identify periods of high (green) and low (red) returns.
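(The table behind such a heatmap is a year-by-month grid of returns. A sketch of how it can be built with pandas, again on synthetic data; the final seaborn call is indicative and commented out so the snippet runs without a plotting backend:)

```python
import pandas as pd

# Synthetic daily index values covering two full years.
idx = pd.date_range("2020-01-01", "2021-12-31", freq="D")
tri = pd.Series([1000.0 + i for i in range(len(idx))], index=idx)

# Month-end value for every (year, month), then month-over-month returns.
monthly = tri.groupby([tri.index.year, tri.index.month]).last()
monthly_ret = monthly.pct_change() * 100

# Pivot into a year x month grid, ready for a heatmap.
heat = monthly_ret.unstack()
# sns.heatmap(heat, annot=True, cmap="RdYlGn")  # if seaborn is installed
```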
Key insights from this chart include:
Finally, let’s create a histogram to visualize the distribution of the monthly returns. This will provide insight into the frequency of certain levels of return.
This histogram shows the distribution of monthly returns for the Nifty TRI. The line represents the kernel density estimation (KDE), which estimates the probability density function of the variable.
Key insights from this chart include:
These various charts provide a multifaceted view of the data, each offering unique insights. From understanding the overall trend to detecting volatility, observing seasonality, and discerning the distribution of returns, we can derive a comprehensive understanding of the Nifty TRI’s historical performance.
Prompt: any more insights from the data?
Sure, here are a few more analyses we can do:
Let’s start with calculating descriptive statistics.
Here are the descriptive statistics for the Nifty TRI and the Monthly Returns:
Key insights from these statistics include:
Next, let’s calculate the maximum drawdown and the Compound Annual Growth Rate (CAGR).
Here are the additional insights:
In summary, this data analysis provides several insights into the Nifty TRI’s historical performance. Understanding these aspects, such as the overall trend, volatility, monthly and yearly returns, return distribution, maximum drawdown, and CAGR, can help investors make informed decisions about their investment strategies.
Notice above that Code Interpreter seems to have made an error with the drawdown calculation: “maximum drawdown for the Nifty TRI is approximately 5.28”, which makes no sense. It’s important to keep in mind that ChatGPT, or any LLM, makes mistakes, sometimes silly ones, so always check its work before building further analysis on it.
Prompt: here’s the file again with index closing values. please use this to redo the drawdown calculation and identify the worst drawdown. also plot a graph visually representing drawdowns
The maximum drawdown for the Nifty TRI is approximately -0.595, or -59.5%. This means that from its peak, the index has seen a reduction to roughly 40.5% of its peak value at its worst point.
Now, let’s plot a graph visually representing drawdowns. This graph will provide a visual representation of the depth, duration, and recovery of each drawdown period.
I apologize for the oversight; it seems the matplotlib.pyplot library wasn’t imported due to the previous execution reset. Let’s import the necessary libraries and try the plot again.
This chart shows the drawdowns over time for the Nifty TRI. Drawdowns are represented by the drops below the 0% line. The shaded red area visually represents the extent and duration of each drawdown.
Key insights from this chart include:
The most severe drawdown appears to occur during the 2007-2009 period, which corresponds to the global financial crisis. During this time, the market experienced significant declines, which is reflected in the depth of the drawdown during this period.
Remember, a drawdown is a peak-to-trough decline during a specific recorded period of an investment. A drawdown is usually quoted as the percentage between the peak and the trough.
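(A sketch of how such a shaded drawdown chart can be produced with matplotlib. The series is synthetic, and the Agg backend keeps the snippet runnable without a display:)

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; no display needed
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic series: steady rise with a temporary ~15% dip in the middle.
idx = pd.date_range("2020-01-01", periods=300, freq="D")
tri = pd.Series([100 + i - (30 if 100 <= i < 200 else 0) for i in range(300)],
                index=idx, dtype=float)

# Drawdown in percent: drops below the 0% line.
drawdown = (tri / tri.cummax() - 1.0) * 100

fig, ax = plt.subplots()
ax.fill_between(drawdown.index, drawdown, 0, color="red", alpha=0.3)
ax.plot(drawdown.index, drawdown, color="red", linewidth=0.8)
ax.set_ylabel("Drawdown (%)")
ax.set_title("Drawdowns over time (synthetic data)")
fig.savefig("drawdowns.png")
```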
At this point, let’s say we’d like to dive deeper into the analysis of drawdowns, their frequency, severity and time taken to recover.
Prompt: here’s a file with daily index closing values. Convert the date column to datetime and sort ascending. then do some analysis about drawdowns in the nifty. Mainly, how often does the nifty fall more than 10% from its peak. And when it does, how much more does it typically fall, and how long does it take to recover from its lowest point. Please represent the analysis in tabular form and graphical form where appropriate
Alright, let’s break down the tasks:
Let’s start with the first task.
The data has been successfully sorted in ascending order based on the date.
Next, we’ll proceed with analyzing the drawdowns in the Nifty. Here’s the plan:
Let’s start by computing the running maximum and the drawdown.
We’ve computed the running maximum and the percentage drawdown for each day.
Next, we’ll identify periods where the drawdown exceeds 10% and compute the following for each such period:
To do this, we’ll define a drawdown period as a continuous sequence of days where the index is below its previous peak. The start of a drawdown period is when the index falls below its previous peak, and it ends when the index surpasses this peak.
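(A sketch of this episode-splitting logic on a tiny synthetic series with one deep drawdown; the real analysis runs on the uploaded daily file. The trick is to label each below-peak run by counting the peak-setting days before it:)

```python
import pandas as pd

# Tiny synthetic series with one >10% drawdown that fully recovers.
idx = pd.date_range("2020-01-01", periods=10, freq="D")
tri = pd.Series([100, 110, 95, 90, 85, 92, 100, 111, 115, 120],
                index=idx, dtype=float)

running_max = tri.cummax()
drawdown = tri / running_max - 1.0

# A drawdown episode is a run of days below the previous peak; give each
# run an id by counting the peak-setting days that precede it.
at_peak = tri >= running_max
episode_id = at_peak.cumsum()

records = []
for _, seg in drawdown[~at_peak].groupby(episode_id[~at_peak]):
    depth = seg.min()
    if depth <= -0.10:  # keep only falls of more than 10% from the peak
        trough = seg.idxmin()
        records.append({
            "start": seg.index[0],
            "trough": trough,
            "max_drawdown_pct": round(depth * 100, 1),
            # days from the trough to the last day still below the old
            # peak; the peak itself is regained on the following day
            "days_trough_to_recovery": (seg.index[-1] - trough).days,
        })

summary = pd.DataFrame(records)
```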
Let’s identify these drawdown periods and compute the metrics of interest.
Here’s a tabular summary of the drawdowns in the Nifty where the index fell more than 10% from its peak:
Next, we’ll visually represent this data. We’ll plot the drawdowns over time with the depth of the drawdown (Max Drawdown %) on the y-axis and the Start Date on the x-axis. This will help in visually understanding the magnitude and frequency of significant drawdowns.
The bar chart illustrates the significant drawdowns in the Nifty where the index fell more than 10% from its peak. Each bar’s height indicates the maximum percentage drop from the peak during the respective drawdown period, and the position on the x-axis denotes the starting date of the drawdown.
From this visual representation, we can observe periods of significant drops, like the one around 2008 (which aligns with the global financial crisis) and the drop in early 2020 (corresponding to the pandemic’s start).
To address the remaining parts of your request, we’ll now compute:
Let’s calculate these metrics.
Here are the calculated metrics:
In summary, while significant drawdowns of more than 10% are not frequent, when they do occur, the average further decline is around 23%. Recovery to previous peak levels typically takes about a year.
Would you like any further analysis or visualizations?
Prompt: can you write out a table with each instance of the 17 instances shown in the chart, the maximum drawdown seen and the number of days to recover
I’ve created an image of the table detailing the significant drawdowns in the Nifty.
Would you like any further assistance?
Prompt: is there a way to improve the look of the table to make it professional and readable
Certainly! Here are a few ways we can enhance the table’s appearance to make it look more professional and readable:
Let’s implement these enhancements.
–End of ChatGPT Code Interpreter session–
This could go on, but you’ve got the general idea by now.
Notice that the prompts were not elaborate or detailed, but fairly open-ended. More specific prompts allow for more elaborate analysis and outcomes. Code Interpreter writes the code, catches errors, and figures out alternative paths.
Of course, this is a simplistic example to highlight what Code Interpreter can do given some data. Getting the right outcome requires some tweaking of the inputs, which typically comes down to writing more structured prompts.
But look at what it could do just in this simple session:
Having done these analyses at various points using both Excel and code, my estimate is that the time required for exploratory analysis reduces by 80-90%.
Not just that: Code Interpreter can unpack zip files containing multiple files, combine them, and analyse them further. It can also read PDF files to summarise them or extract data from them.
The closest analogy for using ChatGPT with Code Interpreter is giving everyone access to a smart yet inexperienced intern who is conversant with Python programming. If your job involves any data and analysis, you should be using Code Interpreter to see how you can speed up and improve your work. And this is just the beginning!
Recommended reading on making use of AI:
The author is on twitter @CalmInvestor