Data Analysis with an LLM Twist

In mid-2023 I wrote a series of blog posts exploring how ChatGPT and Bard could be used to analyse tabular datasets.

I started with some trivial tables and moved on to larger, more complex datasets:

  • With ChatGPT 4 I could copy and paste the table and get some meaningful insight back.
  • Bard was hit and miss. It would sometimes give a great response but would typically just make up some random answer that bore no resemblance to the data.

I then tried asking ChatGPT to create some Python code to process some of the larger tables I was looking at. This worked quite well and I was able to get some reasonable insights back from some public datasets. I even managed to get a Kaggle dataset analysed and submitted an OK entry.

Then along came ChatGPT Data Analyst (aka Code Interpreter). This enabled me to upload a dataset and analyse it within the confines of a single chat session, with no need for me to copy and paste Python code. However, it’s a constrained environment with no internet access and only a limited (but sufficient) set of libraries.

Plugins opened up a new world of possibilities and I was soon hooked on a hosted Jupyter Notebook service that could be remotely controlled by ChatGPT. This was great: I had access to any library and also internet access to download datasets, and I was able to run many great experiments. Unfortunately the service’s business model didn’t stack up and it closed down at the end of 2023.

Roll forward 8 months to today. We now have:

  • Google Gemini
  • Bing Copilot
  • Anthropic Claude v3
  • ChatGPT GPTs

It’s time to revisit some of my posts and see how the latest offerings stack up against each other.

I’ll start with the first data analysis that I tried, seeing how well ChatGPT could help me help my son with his homework.

Telemetry from a Crash-Test Experiment

My first experiment was to analyse some telemetry data from my son’s crash-test engineering experiment: force sampled every 2 ms (the timestamps advance in 0.002 s steps) for about 2 seconds.

[Figure: Excel scatter chart of Force (N) against Time (s)]

No context was given in the prompt, only the column headers and only the text data.

analyse this data and provide some insight about your findings
Time (s) B2	Force (N) B2
-0.008	0.01
-0.006	0.01
-0.004	0
1.918	0.35
1.92	0.3
1.922	0.26
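
A quick way to sanity-check what the models say is to parse the pasted text yourself. The sketch below is only an illustration, not the code used for this post. It uses just the six rows quoted above (the full dataset isn't reproduced here), and the function names `parse_telemetry` and `summarise` are made up for the example.

```python
import csv
import io
import statistics

# Only the six rows quoted in the prompt excerpt; the rest of the dataset
# is not reproduced here, so the "peak" found below is NOT the real peak.
RAW = (
    "Time (s) B2\tForce (N) B2\n"
    "-0.008\t0.01\n"
    "-0.006\t0.01\n"
    "-0.004\t0\n"
    "1.918\t0.35\n"
    "1.92\t0.3\n"
    "1.922\t0.26\n"
)


def parse_telemetry(text):
    """Parse tab-separated (time, force) rows, skipping the header row."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the column headers
    return [(float(row[0]), float(row[1])) for row in reader if len(row) == 2]


def summarise(samples):
    """Return row count, typical sampling interval and the largest force."""
    times = [t for t, _ in samples]
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    peak_time, peak_force = max(samples, key=lambda sample: sample[1])
    return {
        "rows": len(samples),
        # Median rather than mean, so the gap in the excerpt is ignored.
        "interval_s": statistics.median(gaps),
        "peak_force_n": peak_force,
        "peak_time_s": peak_time,
    }


if __name__ == "__main__":
    print(summarise(parse_telemetry(RAW)))
```

Run against the full table, the same two functions would give the real peak force and its timestamp, which makes it easy to check the figures each model reports.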

My Observations

In this experiment, I put various state-of-the-art language models to the test, challenging them with the task of analysing and interpreting a simple dataset.

  • ChatGPT 4 (2024) emerges as the clear frontrunner, delivering the most accurate and insightful response to the data analysis task. Its performance showcases improvement over the August 2023 version, underscoring the rapid advancements in language model capabilities.
  • Claude's response is highly promising, especially considering its primary focus on text-based processing. Its ability to deftly handle tabular data analysis is both surprising and encouraging, hinting at its versatile potential.
  • Gemini and Bing Copilot, on the other hand, appear to be lagging behind in their readiness for data analytics tasks, as evidenced by their responses and ability to process data.

Moving forward, ChatGPT 4 seems to be the most viable choice for further data analytics experiments, particularly involving larger and more complex datasets. Its demonstrated proficiency in this domain instills confidence in its ability to tackle increasingly challenging data analysis scenarios.

Concurrently, Claude's promising performance warrants further exploration of its capabilities, positioning it as a potential alternative or complementary tool for specific data analysis tasks.

The insights gleaned from this exercise underscore the rapid evolution of language models and their growing prowess in areas beyond traditional text processing, paving the way for their expanded utilisation in data-driven domains.

If you're interested in the detail, see below.

Results from Aug 2023

ChatGPT 4

The data you provided appears to be a time series of force measurements. Here are some initial observations:

  1. The force initially starts at a very low value, then rapidly increases to a peak around 0.01 seconds, reaching a maximum of approximately 29.41 N.
  2. After this peak, the force decreases and stabilizes to a low value, with minor fluctuations around zero.
  3. Around 0.96 seconds, there is another rapid increase in force, reaching a peak of approximately 3.62 N at around 1.442 seconds.
  4. After this second peak, the force again decreases and stabilizes to a low value, with minor fluctuations around zero.

To provide more detailed insights, it would be helpful to know the context of these measurements. For example, are these forces being applied to a specific object? Is this part of a mechanical test or a physical experiment? The context can greatly influence the interpretation of the data.


Bard

The data shows that the force increases steadily until it reaches a maximum of 29.41 N at 1.4 seconds. After that, the force decreases steadily until it reaches 0 N at 1.5 seconds.

The data can be interpreted in a number of ways. One interpretation is that the force is caused by the acceleration of the object. As the object accelerates, the force increases. When the object reaches its maximum velocity, the force reaches its maximum value. After that, the object begins to decelerate, and the force decreases.

Another interpretation is that the force is caused by the friction between the object and the surface it is moving on. As the object moves faster, the friction increases. When the object reaches its maximum velocity, the friction reaches its maximum value. After that, the object begins to decelerate, and the friction decreases.

Results from March 2024

ChatGPT 3.5

To analyze the provided data, we have time (in seconds) and force (in Newtons) readings. Let's start by visualizing the data to understand the relationship between time and force.

Here's a plot of force against time:

Looking at the plot, we can observe the following insights:

  1. Initial Stability: At the beginning (around time 0 to 0.02 seconds), the force is relatively stable around 0 N, with minor fluctuations.
  2. Force Buildup: As time progresses, there's a gradual increase in force, starting from around 0.02 seconds and reaching a peak around 1 second.
  3. Peak Force: The force peaks at approximately 1.1 seconds, reaching around 29.41 N.
  4. Steady Decline: After the peak, there's a steady decline in force until approximately 1.4 seconds.
  5. Fluctuations: Beyond 1.4 seconds, the force fluctuates around a lower magnitude, indicating a relatively stable system.

Based on this analysis, it seems like there's a dynamic process occurring, possibly a physical experiment or system undergoing changes over time. Further analysis or context about the experiment or system would be needed for a more detailed interpretation.

ChatGPT 4

Analysing the provided data, which records the force (in newtons) experienced by an object over time (in seconds), reveals a distinct pattern and several key phases throughout the experiment or observation. Here are the insights drawn from the analysis:

  1. Initial Equilibrium (Pre-impact Phase): The dataset begins with negative time values, indicating measurements taken before a specific event or impact. The force values are near zero, suggesting that the object was in a state of equilibrium or experiencing negligible force.
  2. Sudden Increase in Force (Impact Phase): At time zero, there is a sudden and sharp increase in force, indicating the moment of impact or the start of the force application. The force increases rapidly from nearly zero to a peak value of 29.41 N within 0.01 seconds. This suggests a high-intensity event, like a collision or a sudden application of force.
  3. Oscillations and Damping (Post-impact Phase): Following the peak, the force values oscillate, gradually decreasing in magnitude. This behaviour is indicative of a system undergoing damping, where the oscillations (fluctuations in force) diminish over time due to energy dissipation mechanisms like friction or air resistance. The force values show a pattern of spikes and drops, eventually stabilising back to near-zero values, which suggests the system is returning to equilibrium.
  4. Secondary Increase in Force: Notably, there are two significant secondary increases in force after the initial impact phase. The first starts around 0.962 seconds with a gradual increase, peaking at 9.85 N at 0.98 seconds. The second begins around 1.692 seconds, with a more gradual increase and smaller peak forces, indicating a lesser but still noticeable event or series of events affecting the system.
  5. Return to Near-zero Force Values: Towards the end of the dataset, force values approach zero, indicating the cessation of significant external forces acting on the object and the system's return to a state of equilibrium.
  6. Negative Force Values: There are instances of negative force values, which could indicate measurement errors, noise in the data, or actual instances where the force applied in the direction opposite to the defined positive direction. These instances are relatively small in magnitude and brief in duration.

In conclusion, the data illustrates a system experiencing a high-intensity impact followed by a series of oscillations as it returns to equilibrium. The presence of secondary increases in force suggests additional events or interactions affecting the system post the initial impact. The eventual return to near-zero force values indicates the system stabilising after experiencing the series of forces.


Gemini

Oh dear, at least Bard could handle this.

I'm still learning how to answer this question. In the meantime, try Google Search.

Bing Copilot

The 4,000-character limit truncates the data 0.73 s into the experiment.
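
To see where a character limit would cut a pasted table, something like the sketch below could be used. It is a rough illustration only: I'm assuming a hard cut-off at a fixed number of characters and trimming back to whole lines, which may not be exactly how Copilot handles over-long input. The helper names `truncate_to_limit` and `last_timestamp` are hypothetical.

```python
def truncate_to_limit(text: str, limit: int = 4000) -> str:
    """Keep only the leading whole lines of `text` that fit in `limit` characters."""
    kept = []
    used = 0
    for line in text.splitlines(keepends=True):
        if used + len(line) > limit:
            break  # the next row would not fit, so stop here
        kept.append(line)
        used += len(line)
    return "".join(kept)


def last_timestamp(text: str) -> float:
    """Return the time value (first tab-separated field) of the last row."""
    last_row = text.strip().splitlines()[-1]
    return float(last_row.split("\t")[0])
```

Calling `last_timestamp(truncate_to_limit(table_text))` on the pasted table (header included) shows the last reading the model actually gets to see.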

The data you provided appears to be a time series of force measurements. Here are some observations:

  1. The force starts at a very low value, increases rapidly to a peak, and then decreases gradually. This could suggest an event like an impact or collision, where the force is initially small, quickly reaches a maximum when the impact occurs, and then dissipates over time.
  2. The maximum force recorded is 29.41 N at 0.01 s.
  3. After the peak, the force decreases to a value close to zero and fluctuates around zero for the rest of the time series. This could indicate that the force has been effectively absorbed or dissipated after the initial impact.
  4. There are some minor fluctuations in the force after the initial event, but these are relatively small compared to the peak force.
  5. The data ends abruptly at 0.738 s, so we don’t have information about the force behavior after this point.

Without additional context, it’s hard to provide more specific insights. If this data is from a specific experiment or event, knowing more about the context could help interpret the data more accurately. For example, if this data is from a crash test, the peak force would be the maximum impact force experienced by the vehicle, and the subsequent decrease would represent the dissipation of this force. The fluctuations around zero could then represent minor adjustments or movements after the main impact.

Anthropic Claude V3 Sonnet

Based on the provided data, here are some insights:

  1. The data appears to show a periodic or oscillatory behavior, with the force values rising and falling in a repeating pattern.
  2. The maximum force value recorded is around 30 N, which occurs at around 0.01 seconds.
  3. The period of the oscillation, as estimated from the time between consecutive peaks or troughs, is approximately 0.2 seconds.
  4. The amplitude of the oscillation, which is the difference between the maximum and minimum force values, decreases over time, indicating that the oscillations are damped.
  5. The oscillations appear to be centered around a mean force value of approximately 0 N, as the positive and negative force values are roughly symmetrical about the zero line.
  6. There are some irregularities in the data, such as small deviations from the smooth oscillatory pattern, which could be due to measurement noise or other factors.
  7. The data suggests that the phenomenon being measured involves a periodically varying force or motion, which could be related to vibrations, oscillations, or cyclical processes in the system under study.

To further analyze the data and gain more insights, it would be helpful to have additional context about the experimental setup, the physical system being measured, and the units of the time and force values. Plotting the data could also provide a clearer visual representation of the oscillatory behavior.

Appendix: Self Rated Observations

For fun, I asked both Claude and ChatGPT 4 to critically review the responses.


Claude

I will critically analyze the responses from the different language models (LLMs) to the provided telemetry data from a crash-test experiment.


ChatGPT 4:

  • Provides a good high-level overview of the data, describing the initial low force values, the rapid increase to the peak force, the decrease and stabilization after the peak, and the second smaller peak around 0.96 seconds.
  • However, it doesn't go into much detail in interpreting the patterns or speculating on the underlying causes.
  • It rightly notes that more context would be needed for a deeper interpretation.
  • This response demonstrates a solid grasp of the different phases in the force measurements. The main limitation is not providing a more physical interpretation of what might be causing this force behavior.


Bard:

  • Provides a basic description of the overall force trend, mentioning the increase, peak, and subsequent decrease.
  • Offers two possible interpretations: acceleration/deceleration of an object or friction between an object and a surface.
  • However, the analysis is quite superficial and misses many important details present in the data, like the oscillations, secondary force increases, and stabilization patterns.


ChatGPT 3.5:

  • Visualizes the data effectively through a plot, which aids in identifying patterns.
  • Provides a good analysis of the key phases, including the initial stability, force buildup, peak force, steady decline, and fluctuations towards the end.
  • Correctly identifies the peak force value and the overall dynamic process represented by the data.
  • Acknowledges the need for additional context or information about the experiment/system to provide a more detailed interpretation.
  • A solid analysis, though not as comprehensive as ChatGPT 4's response.

ChatGPT 4:

  • Provides a detailed and accurate analysis of the different phases observed in the data, including the initial equilibrium, sudden force increase, oscillations and damping, secondary force increases, and return to equilibrium.
  • Correctly identifies key features like the peak force value, oscillation patterns, and the presence of measurement noise or negative force values.
  • Offers a plausible interpretation of the data as representing a system experiencing a high-intensity impact or force application followed by dampened oscillations and stabilization.
  • Overall, an excellent analysis that covers most salient aspects of the data.


Gemini:

  • Does not provide any analysis of the data.
  • Instead, it acknowledges its inability to answer the question and suggests using a search engine.
  • Gemini's response is not satisfactory as it does not provide any analysis of the data and instead suggests using a search engine, which is not helpful for the given task.

Bing Copilot:

  • Provides a reasonable analysis of the initial part of the data, including the low starting force, rapid increase to a peak, and subsequent decrease and fluctuations around zero.
  • Correctly identifies the maximum force value and the potential dissipation or absorption of force after the initial impact.
  • However, the analysis is cut off due to the character limit, so it misses the latter part of the data where secondary force increases and stabilization patterns are observed.
  • Acknowledges the need for additional context to provide more accurate insights.

Anthropic Claude V3 Sonnet:

  • Accurately identifies the oscillatory or periodic behavior exhibited by the force data.
  • Provides good estimates of key parameters like the maximum force value, oscillation period, and damping of the amplitude over time.
  • Correctly observes the symmetry of the oscillations around a mean force value of zero.
  • Acknowledges the presence of irregularities or noise in the data.
  • Suggests that the data could be related to vibrations, oscillations, or cyclical processes in the system under study.
  • However, the analysis does not cover the distinct phases or events represented by the data, such as the initial impact, secondary force increases, and stabilization patterns.
  • Acknowledges the need for additional context about the experimental setup and physical system to gain further insights.

In summary, ChatGPT 4 provided the most comprehensive and accurate analysis, followed closely by ChatGPT 3.5 and Claude V3 Sonnet. Bard and Bing Copilot provided more surface-level analyses, while Gemini did not attempt an analysis. The responses highlight the varying capabilities of different LLMs in interpreting and deriving insights from complex data patterns.

ChatGPT 4

Analyzing the responses from various language models (LLMs) on the telemetry data from a crash-test experiment reveals a range of methodologies and depth in the analysis. Each LLM's response reflects its unique approach to processing and interpreting the data, showcasing differences in analytical capabilities, interpretation of data points, and contextual understanding. Below is a critical analysis of the responses:

ChatGPT 4 (2023)

  • Methodology: Employs a descriptive analysis focused on identifying patterns in the data, such as initial low values, peaks, and subsequent stabilizations. It acknowledges the need for contextual information to deepen the interpretation.
  • Strengths: Provides a clear, step-by-step breakdown of the data's patterns. Recognizes the limitation of lacking context for a comprehensive analysis.
  • Limitations: Lacks visualization, which could enhance understanding. Its interpretation is somewhat generic without delving into possible causes or implications of observed patterns.


Bard

  • Methodology: Offers two different interpretations based on potential causes (acceleration and friction). It simplifies the data into a linear progression of events.
  • Strengths: Introduces conceptual frameworks to interpret the data, aiding in theoretical understanding. Attempts to provide explanations beyond mere observation.
  • Limitations: The analysis seems overly simplified and may misinterpret the data, especially considering the complexity of crash-test dynamics. It incorrectly summarizes the data timeline and peak force values.

ChatGPT 3.5

  • Methodology: Suggests visualizing the data to identify relationships between time and force. It outlines observed insights based on an assumed plot.
  • Strengths: Advocates for data visualization, which can aid in comprehension. Highlights specific phases within the dataset.
  • Limitations: The response is purely hypothetical since no actual plot is provided, limiting its utility. Like others, it lacks contextual exploration of the phenomena causing the observed patterns.

ChatGPT 4 (2024)

  • Methodology: Provides a detailed, phase-wise analysis, recognizing initial equilibrium, impact, oscillations, and secondary impacts. It includes the notion of damping and system stabilization.
  • Strengths: Offers a comprehensive and nuanced interpretation, considering different phases of the experiment. Mentions the need for context to fully understand the data.
  • Limitations: Despite the depth, it does not offer visualizations or suggest specific models or theories that could explain the observed patterns more concretely.


Gemini

  • Methodology: Admits inability to analyze the data, suggesting reliance on external search.
  • Strengths: N/A due to lack of analysis.
  • Limitations: Inability to provide any insights or analysis, reflecting limitations in data handling or specific domain knowledge.

Bing Copilot

  • Methodology: Observes patterns and suggests possible interpretations, such as an impact event, based on force dynamics.
  • Strengths: Identifies a clear narrative in the data, suggesting an event like an impact or collision and subsequent dissipation.
  • Limitations: The analysis ends abruptly due to character limit constraints, which may omit significant insights. Like others, it lacks a deeper exploration of underlying causes.

Anthropic Claude V3 Sonnet

  • Methodology: Focuses on periodic or oscillatory behavior, suggesting a detailed pattern analysis.
  • Strengths: Provides a unique perspective by identifying oscillatory behavior and damping, which adds a different dimension to the analysis.
  • Limitations: The interpretation seems misaligned with the context of crash-test data, potentially misreading the dataset's nature.

Conclusion: The range of responses from different LLMs illustrates varying levels of depth, contextual understanding, and analytical capabilities. Models that offer detailed phase-wise analyses or incorporate theoretical frameworks provide more insightful interpretations. However, most models struggle with contextualization and the application of specific scientific principles or theories that could offer a deeper understanding of the observed phenomena. The lack of data visualization across all responses is a notable gap, considering its importance in data analysis.
