What questions should you ask of Chat-GPT based analytics platforms?
Harry Powell
Data science leader with track record of innovation and value creation
You know the scenario. You are flooded with sales guys showing you amazing software applications based on some AI technology that is going to change your world. It looks amazing, it really does, and when you see the demo you genuinely think that your company needs to invest or it will get left behind.
But you know, underneath it all, that AI is not a solved problem. This software will have limitations, and maybe they will mean that the software is not right for your firm yet. It's just that you don’t know enough to ask the questions to expose what those limitations are. So you are kind of stuck.
A case in point is the explosion of GPT/Chat interfaces for data analytics. These promise that everyone in your organisation will be able to ask questions of your data, even if they aren’t analysts, just by asking natural language questions. Instead of typing out a SQL query to calculate a KPI, you can simply ask your computer to access the data and tell you what you need to know.
This is a persuasive value proposition. The target market is large; most corporates aspire to data-driven decision making but have struggled to upskill staff. Despite having terabytes of valuable data in data lakes, it isn’t being used at the coal face. Chat promises to remove the hurdle of having to train staff so that ordinary business people can use the data themselves.
I have no doubt that plenty of CEOs will want this, and I am relatively confident that it is technically plausible given the new large language models available. So investors will be interested too.
Perhaps the most important questions to ask are not directly related to the obvious functionality. All of these applications will be able to turn language into numbers, and the sales guys will have a long list of compelling examples. But it is often the unstated stuff that matters most. Sure, the output looks impressive, but what are the implicit assumptions underlying this process, the unspoken constraints that we all just take for granted when working with real analysts, that the software will need to replicate if it is to be a true replacement?
Here are some of the questions I would ask.
Does it allow me to ask for an answer or do I still have to tell it what to do?
This is the core functionality. You need to be able to ask something like “tell me the most profitable product” without specifying how the data is to be joined, how margin is to be calculated, what outliers are to be removed, and so on. All that detail should be inferred from context; if the user still has to supply it, then it's not clear that there is any gain in using Chat Analytics.
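To make the point concrete, here is a sketch of everything hiding behind that one sentence. The schema, the margin definition, and the aggregation are all hypothetical, invented purely for illustration; they are exactly the kind of implicit choices a Chat Analytics system would have to make on the user's behalf.

```python
import sqlite3

# Hypothetical toy schema -- the detail a business user should NOT need to spell out.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, unit_cost REAL);
    CREATE TABLE sales (product_id INTEGER, quantity INTEGER, unit_price REAL);
    INSERT INTO products VALUES (1, 'Widget', 2.0), (2, 'Gadget', 5.0);
    INSERT INTO sales VALUES (1, 100, 3.0), (2, 10, 12.0);
""")

# "Tell me the most profitable product" hides a join, a definition of margin,
# and an aggregation -- this query makes every one of those choices explicit.
row = con.execute("""
    SELECT p.name, SUM(s.quantity * (s.unit_price - p.unit_cost)) AS profit
    FROM sales s JOIN products p ON p.id = s.product_id
    GROUP BY p.name
    ORDER BY profit DESC
    LIMIT 1
""").fetchone()
print(row)
```

Every line of that SQL is a decision the user would otherwise have had to take; the promise of Chat Analytics is that all of it is inferred from context.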
Does it prompt me to ask new and better questions of the data?
If you are always going to ask the same questions you may as well have a dashboard. The point of analytics is not necessarily going straight to the answer, no matter how efficient that sounds. It is to discover facts you didn’t yet know. If the AI allows the user to sidestep the creative process, then the AI had better be able to do that for you as well. So you need your Chat Analytics to say “here’s the answer you asked for, but given the context you might like to know about this as well”. Without this functionality you may end up only noticing problems once they become sufficiently embedded to affect your core KPIs.
How does it check that the query it executed is what the user meant?
One of the great advantages of conventional database queries is that it is possible to express yourself unambiguously. Of course that isn’t always what happens, particularly with complex queries. But any one statement has one meaning, and that meaning is the same to everyone. Chat Analytics needs to be able to disambiguate. The mechanism could be simple, but it needs to distinguish between potentially subtle interpretations in a way that the business user can understand.
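One simple form this could take is playing the candidate interpretations back to the user in plain business language before anything is executed. The interpretations and wording below are hypothetical, invented for illustration only:

```python
# Hypothetical disambiguation step: before running any query, the system
# paraphrases each candidate interpretation in business terms and asks
# the user to choose, rather than silently picking one.
candidates = {
    "gross": "Profit before overheads, for orders PLACED last quarter",
    "net": "Profit after overheads, for orders SHIPPED last quarter",
}

def clarifying_question(question: str, options: dict) -> str:
    """Render a plain-language menu of interpretations for the user."""
    lines = [f'Your question "{question}" could mean:']
    for i, text in enumerate(options.values(), start=1):
        lines.append(f"  {i}. {text}")
    lines.append("Which did you mean?")
    return "\n".join(lines)

msg = clarifying_question("What was our profit last quarter?", candidates)
print(msg)
```

The hard part, of course, is not printing the menu but generating distinctions like “placed” versus “shipped” in the first place.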
How do you help users understand how the results should be interpreted?
Analytics results can be surprisingly hard to rely on; they rest on assumptions which may or may not be valid. Sometimes those assumptions don’t matter; other times they do. Analysts themselves often ignore the assumptions and get away with it because they know how the result is going to be used. Reporting a long list of conditions with every result will just lead to them being ignored. A good Chat Analytics system will flag the material assumptions and suppress the rest.
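A minimal sketch of that triage might look like the following. The assumptions, impact figures, and materiality threshold are all hypothetical, made up for illustration; in practice, estimating an assumption's impact is the genuinely hard problem.

```python
# Hypothetical assumption triage: attach caveats to a result and surface
# only those likely to change the conclusion, instead of burying the user.
result = {"metric": "Q3 margin", "value": 1.2e6}
assumptions = [
    {"text": "Returns in transit excluded", "impact_pct": 4.0},
    {"text": "FX rates frozen at month-end", "impact_pct": 0.1},
    {"text": "Intercompany sales netted out", "impact_pct": 0.05},
]

# The threshold is a judgment call, not a constant of nature.
MATERIALITY_PCT = 1.0

material = [a["text"] for a in assumptions if a["impact_pct"] >= MATERIALITY_PCT]
print(f"{result['metric']}: {result['value']:,.0f}")
for text in material:
    print(f"  caveat: {text}")
```

Here only the one caveat that could plausibly move the number reaches the user; the other two are recorded but not reported.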
Is it able to check that the answers are right?
Just like humans, Chat-GPT doesn’t always get the answer right but, dangerously, tends to report results with a degree of authority bordering on certainty. Good analysts doubt themselves. They tend to try to calculate the same thing in two different ways to make sure it is correct. How Chat Analytics does this, and how it determines what to do if the answers don’t match will be critical to success. After all it would be unhelpful simply to deliver both results to the user and ask them to decide.
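The “calculate it two ways” habit is easy to illustrate. The data and the two routes below are hypothetical, chosen only to show the pattern of reconciling independent calculations rather than reporting one with false confidence:

```python
import math

# Hypothetical cross-check: compute the same KPI via two independent
# routes, as a careful analyst would, and refuse to report on a mismatch.
orders = [
    {"revenue": 120.0, "cost": 80.0},
    {"revenue": 200.0, "cost": 150.0},
]

# Route 1: sum of per-order margins.
margin_a = sum(o["revenue"] - o["cost"] for o in orders)

# Route 2: total revenue minus total cost.
margin_b = sum(o["revenue"] for o in orders) - sum(o["cost"] for o in orders)

def reconciled(a: float, b: float, tol: float = 1e-6) -> float:
    """Return the answer only if both routes agree; otherwise raise."""
    if not math.isclose(a, b, rel_tol=tol):
        raise ValueError(f"cross-check failed: {a} vs {b}")
    return a

print(reconciled(margin_a, margin_b))
```

The interesting design question is what the system should do when the check fails: silently retrying, escalating to an analyst, or explaining the discrepancy are very different products.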
Knowing the answers to questions like that will tell you how much training your workforce will need despite the new interface. It is possible that you could have a Chat Analytics interface, and yet still have to upskill analytically literate people to ask the big picture questions.
Perhaps the biggest question of them all is not to do with the interface, but with the underlying data.
Does it need to be fed with good data?
How does it handle incomplete, incompatible and incorrect data? What does it do to bridge the gap? What intuitions does it have about the business that enable it to make the best of a bad data environment? Given that data engineering is 80% of data analytics, it is possible that Chat Analytics platforms are answering the wrong question. Instead of asking “how can I get everyone to do analytics?”, maybe you should be asking “how can I get everyone good data?”
But that’s a story for another day.
Comments

Founder, Graphistry.com & Louie.AI:
Good questions, and the tip of the iceberg for enterprise environments with all sorts of visualisation, wrangling, safety, performance, and collaboration needs. We are piloting a GPT environment for data analyst and BI teams, including graph DB connectors for F500 teams, so if this is a problem for TigerGraph users, happy to chat! Some of our users need this to go all the way to legal admissibility of evidence in court, and to preventing novice staff from taking down important accounts, so there is a lot to get right when teaching an LLM to work with a DB.

He/Him. Black lives matter. Making the world a little better, one byte at a time:
I feel like the part people forget most about these generative AIs is that they are completely dependent on the data sources they have access to when training and analysing. And more sources doesn't inherently mean better sources: connecting a hundred low-quality data sources and one high-quality data source is still likely (if not weighted properly) to end up with a model that produces results that are very confident and very wrong.

Data Scientist at QuantumBlack, AI by McKinsey:
How do you see the confidentiality question here? Wouldn't the deployed chat need to cross-check against its source for your data? Processing a company's data off-prem, potentially using it to further train the model, and so on sounds like legally murky ground (or expensive opt-outs).