Using ChatGPT for Data Analysis #2

I wrote an earlier post about using ChatGPT for data analysis, where I tried to reconcile the probabilistic nature of ChatGPT (I get different answers when I ask the same question) with the deterministic nature of data analysis (the average of 4 and 6 is 5, no matter how many times I ask).

In that post I broke down how data analysis is done in ChatGPT Data Analyst and how it leverages Python for the deterministic part. That said, the probabilistic nature of GenAI can still introduce errors, so in this post I wanted to work through an example and see the results for myself so I can understand the guardrails.

I googled for sample data and got this link to some sales data. I downloaded it into Excel and asked my first question -

Please review the attached excel and explain what this data set is about

The first response was -

Response 1

When I asked the question again the response was -

Response 2

This is a good example of how ChatGPT combines deterministic responses with probabilistic responses. The column names (deterministic - Python) did not change (as I would hope), but all the text around them (probabilistic - GenAI), while similar, does change.
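To make the deterministic half concrete, here is a minimal sketch of the kind of lookup that returns column names. The rows are made up for illustration, and only the field names 'Order Date', 'Rep', 'Units' and 'Total' come from this post. This is not necessarily the code ChatGPT ran.

```python
import pandas as pd

# Hypothetical rows standing in for the downloaded sales data.
df = pd.DataFrame(
    {
        "Order Date": pd.to_datetime(["2023-01-06", "2023-06-25", "2024-02-09"]),
        "Rep": ["Jones", "Kivell", "Jones"],
        "Units": [95, 50, 36],
        "Total": [189.05, 999.50, 179.64],
    }
)

# Reading the column names is a plain lookup, so repeated runs agree exactly.
first_run = list(df.columns)
second_run = list(df.columns)

print(first_run)
print(first_run == second_run)
```

The wording ChatGPT wraps around these names can vary from run to run, but the list itself cannot.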

Then I asked it a 3rd time and this was the result -

Response 3

Overall, I am OK with these responses. While all 3 are different, they are accurate and I get the point. That said, the one place where errors can creep in is the description of each field name. My data set did not have any descriptions of the fields, so ChatGPT had to infer them. For example -

Rep - "The sales representative responsible for the sale is recorded in this column. This information can be used to evaluate the performance of individual sales reps and to provide targeted feedback or incentives."

While this is probably accurate, one can see how a hallucination could be a problem here. Those of us who have worked at enterprises know how field names aren't always the most descriptive.

Then I asked it - "what is the sales per rep per year with year on columns and rep on rows"

These were the responses -

Response 1


Response 2

So how is it doing the deterministic part? It is running Python code -

Understanding the data structure
Sales by rep per year

So what assumptions did GenAI make based on my prompt? The prompt was - "what is the sales per rep per year with year on columns and rep on rows".

It had to translate what I meant by "Sales", "Rep" and "Year". It was smart enough to map Sales to the 'Total' field, Rep to the 'Rep' field, and Year to the year taken from the 'Order Date' field.
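Written out by hand, that mapping might look like the sketch below. The sample rows are invented, and deriving a 'Year' column with `dt.year` is my assumption about how the year gets pulled from 'Order Date'. The code ChatGPT actually generated may differ.

```python
import pandas as pd

# Hypothetical rows; only the field names come from the post.
df = pd.DataFrame(
    {
        "Order Date": pd.to_datetime(
            ["2023-01-06", "2023-06-25", "2024-02-09", "2024-03-15"]
        ),
        "Rep": ["Jones", "Kivell", "Jones", "Kivell"],
        "Units": [95, 50, 36, 10],
        "Total": [100.0, 200.0, 50.0, 25.0],
    }
)

# "Year" in the prompt -> the year part of the 'Order Date' field (assumed derivation).
df["Year"] = df["Order Date"].dt.year

# "Sales" -> 'Total', "rep on rows" -> index, "year on columns" -> columns.
sales_per_rep_per_year = df.pivot_table(
    index="Rep", columns="Year", values="Total", aggfunc="sum"
)

print(sales_per_rep_per_year)
```

Each phrase in the prompt ends up as one argument, which is where a wrong guess about "Sales" would quietly change the answer.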

What if I had meant Sales as in total number of units sold? I would have to change the prompt to - "what is the sales in units per rep per year with year on columns and rep on rows"

This is no different than what I would do in Tableau. Drag and drop the Total field and if I wanted Units, I'd drag that field over.

When I asked it to generate sales in units per rep per year it generated this code -

# Calculate total units sold per rep per year
units_per_rep_per_year = df.pivot_table(index='Rep', columns='Year', values='Units', aggfunc='sum')

tools.display_dataframe_to_user(name="Units Sold Per Rep Per Year", dataframe=units_per_rep_per_year)

# Display the result
units_per_rep_per_year

Somehow it knew how to translate my query into these arguments of the df.pivot_table function - index='Rep', columns='Year', values='Units', aggfunc='sum'.

In my next post I will see if I can figure out what errors could creep into the code generation piece of this black box.
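One way to put a guardrail around a translation like this is to check the generated pivot against the raw data. The sketch below is my own illustration, not something ChatGPT produced. It assumes a 'Year' column already exists, uses invented rows, and introduces the names `required`, `missing` and `totals_match` purely for the example.

```python
import pandas as pd

# Hypothetical rows; 'Year' is assumed to have been derived already.
df = pd.DataFrame(
    {
        "Rep": ["Jones", "Kivell", "Jones", "Kivell"],
        "Year": [2023, 2023, 2024, 2024],
        "Units": [95, 50, 36, 10],
    }
)

# Guardrail 1: every column the generated code refers to must exist.
required = {"Rep", "Year", "Units"}
missing = required - set(df.columns)

units_per_rep_per_year = df.pivot_table(
    index="Rep", columns="Year", values="Units", aggfunc="sum"
)

# Guardrail 2: the pivot's grand total should match the raw column total.
pivot_total = units_per_rep_per_year.sum().sum()
raw_total = df["Units"].sum()
totals_match = pivot_total == raw_total

print(missing, pivot_total, totals_match)
```

A check like this will not catch the case where ChatGPT picks 'Total' when I meant 'Units', but it does confirm that the arithmetic on the chosen field adds up.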
