What's So Challenging About Building Chatbots? Drawing lessons from the trenches.
The baseball player probably didn't say this quote; more likely, some computer scientist did. But that is not important for this story.


Everything looks easy until you are the one building it.

In early 2023, I was talking with a friend who was spearheading data transformation at a bank. With the release of GPT-4, he announced plans to build their new customer support chatbots on this advanced AI model.

“That sounds exciting,” I remarked. “How long will the project take?”

“With LLMs, building chatbots is now straightforward. We plan to launch the proof of concept by the start of Q3,” he confidently replied.

"Wow, that's just three months away," I exclaimed.

“The plan is to integrate Pinecone and GPT-4. The basic RAG pipelines are set; we just need to layer on the user interface.”

“Keep me posted on your progress,” I said.

A year later, during a follow-up lunch, I learned the project was floundering. The simplicity that was so appealing initially had turned into a quagmire of complexity in what has been dubbed the 'ChatGPT era'.

Contrary to many pundits’ predictions, well-functioning chatbots haven't become ubiquitous. While their numbers have increased since 2022, very few are effectively usable. Most enterprise tasks require conversational designs far too sophisticated for current AI capabilities.

Even leading AI firms like Amazon, Google, and Microsoft haven't fully deployed them on significant customer-facing platforms or in crucial operational areas. It's curious: despite Azure selling AI solutions, if you encounter issues deploying a virtual machine or managing data store permissions, you still need human assistance.

OpenAI.com experimented with a chatbot, but it proved problematic, often requiring human intervention even for straightforward issues. Eventually, they shifted to offering predefined response options instead of free-flowing conversations.

From OpenAI.com: even vendors building chatbots often get tripped up by simple questions.



"In theory there is no difference between theory and practice; in practice there is."

Here are the high-level challenges in building a good chatbot:

  1. Generating the domain knowledge: A good human service agent brings a lot of domain expertise to solving a problem. For some obvious problems and solutions [restart that darn router], this is simple. But once you start building anything serious, you find your knowledge base scattered across all sorts of applications and interfaces. From Apache Spark tables to PDF manuals to previous chat logs to undocumented ideas living only in agents' heads, there is a lot of data needed to solve the problem.
  2. Building the retriever: Once you have a decent knowledge base, the problem becomes retrieving the most relevant chunks. Vendors of AI products have made RAG (Retrieval-Augmented Generation) look deceptively simple: you take your data, send it to an embedding model that generates a big list of numbers, and then use those numbers for similarity search. To reword the famous episode from Seinfeld, the vector DBs know how to take the data; they just don't know how to rank the chunks that best fit your problem.

From the popular sitcom Seinfeld.
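The ranking problem above can be sketched in a few lines. This is a toy illustration, not a production retriever: the bag-of-words `embed` function stands in for a real neural embedding model, and the knowledge-base chunks are made up.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy "embedding": a bag-of-words count vector.
    # Real systems use a neural embedding model instead.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, k=2):
    # Rank knowledge-base chunks by similarity to the query; return top k.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

chunks = [
    "To reset the router, hold the power button for ten seconds.",
    "Billing disputes are handled by the accounts team.",
    "The router firmware can be updated from the admin panel.",
]
print(retrieve("my router keeps disconnecting, how do I reset it?", chunks, k=1))
```

Even in this toy, note how two chunks mention "router" and word overlap alone barely separates them; that ambiguity is exactly the ranking problem that gets much harder at real scale.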

3. Generating the process flow

We have the knowledge; now, how do we generate the process? A good conversation is far more than domain knowledge: there is a particular cadence and flow to it. In many domains, humans have already figured out that flow. We know that a dinner-table conversation differs from a cocktail-party conversation, which differs from an elevator pitch. A customer service agent likewise follows a particular sequence of steps when trying to solve a problem.

Beyond simple customer service, as you start building advisory robots things become even more complicated. LLMs don't know about conversation flows. You need to design that.

It can be a combination of rules that humans have learned over time and machine learning techniques that predict what the next question should be. These flows are often represented as decision trees.

From Ubisend.com

How much flexibility to allow between hardcoded flows and free-flowing AI is a key design question.
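A hardcoded flow of the kind shown in such decision trees can be sketched as a small lookup table. The nodes, questions, and outcomes below are invented for illustration; a real system would blend a tree like this with learned predictions and free-form LLM turns.

```python
# A hypothetical troubleshooting flow modeled as a tiny decision tree.
# Each node either asks a yes/no question or terminates with an outcome.
FLOW = {
    "start": {"question": "Is the router's power light on?",
              "yes": "check_cable", "no": "power_cycle"},
    "power_cycle": {"question": "Unplug for 30 seconds, plug back in. Light on now?",
                    "yes": "check_cable", "no": "escalate"},
    "check_cable": {"question": "Is the ethernet cable firmly connected?",
                    "yes": "resolved", "no": "reconnect_cable"},
    "reconnect_cable": {"question": None, "outcome": "Reconnect the cable."},
    "escalate": {"question": None, "outcome": "Escalate to a human agent."},
    "resolved": {"question": None, "outcome": "Issue resolved."},
}

def run_flow(answers):
    """Walk the tree given a list of 'yes'/'no' answers; return the outcome."""
    node = "start"
    for ans in answers:
        step = FLOW[node]
        if step["question"] is None:
            break
        node = step[ans]
    return FLOW[node].get("outcome")

print(run_flow(["no", "no"]))  # -> Escalate to a human agent.
```

The design question from the paragraph above shows up concretely here: every branch is hardcoded, so the bot is predictable but brittle; swapping nodes for LLM calls buys flexibility at the cost of control.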

4. How do we store the essential bits of user conversation?

GPT-4 and other LLMs don't give you memory; by default, they are stateless. How to represent the conversation in memory is a complex topic.


Why do we need this? As the conversation gets longer [anything beyond a simple customer service exchange], keeping track of the essential bits about the user becomes challenging.

We need to store some of this data as tags in a key-value store. Other parts belong in a knowledge graph, where the relationships between different parts of the conversation can be maintained. Still other parts should go into a vector DB so that we can run similarity searches against earlier parts of the conversation.
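A hybrid of the key-value and vector-search pieces can be sketched as follows (the knowledge-graph part is omitted for brevity). The `embed` function is a toy bag-of-words stand-in for a real embedding model, and all names and sample turns are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words vector; production systems use a neural embedding model.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class ConversationMemory:
    """Hybrid memory: key-value tags for hard facts, vectors for fuzzy recall."""
    def __init__(self):
        self.tags = {}    # e.g. {"account_type": "premium"}
        self.turns = []   # (text, vector) pairs for similarity search

    def remember_tag(self, key, value):
        self.tags[key] = value

    def add_turn(self, text):
        self.turns.append((text, embed(text)))

    def recall(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.turns, key=lambda t: cosine(q, t[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

mem = ConversationMemory()
mem.remember_tag("customer_name", "Alice")
mem.add_turn("I was double charged on my last invoice.")
mem.add_turn("Also, my router keeps dropping the connection.")
print(mem.recall("was I double charged on my invoice"))
# -> ['I was double charged on my last invoice.']
```

Hard facts like the customer's name go into the tags dictionary because they must be retrieved exactly; earlier complaints go into the vector side because future questions will only paraphrase them.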


Other key parts of the challenge:

  1. Promises made. How do you make sure the chatbot doesn't promise the moon?
  2. Latency. Unless you are using a highly optimized inference engine such as Groq, there can be a multi-second delay. This might not be an issue for a text-based chatbot, but once voice is involved, the latency is quite noticeable. There are various ways to mitigate it, but at scale this is a key challenge.
  3. Regulatory compliance. How do you make sure the chatbot's messages comply with customer-protection rules, especially in sectors such as finance and healthcare?
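The "promises made" and compliance concerns are often handled by screening generated replies before they reach the user. Here is a minimal sketch of such a post-generation guardrail; the blocked patterns are entirely illustrative, and real systems layer classifier models and policy engines on top of anything this simple.

```python
import re

# Hypothetical guardrail: block replies that make promises or
# unapproved claims. Patterns below are made up for illustration.
BLOCKED_PATTERNS = [
    r"\bwe guarantee\b",
    r"\byou will (?:definitely|certainly)\b",
    r"\brefund.*no questions\b",
]

def passes_guardrail(reply: str) -> bool:
    """Return True if the reply is safe to send, False if it should be
    regenerated or routed to a human."""
    text = reply.lower()
    return not any(re.search(p, text) for p in BLOCKED_PATTERNS)

print(passes_guardrail("We guarantee a full refund today."))   # False
print(passes_guardrail("Let me check your account details."))  # True
```

Note that a check like this also adds to the latency budget from point 2: every screening pass, model-based or not, sits between generation and the user.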

Information is not knowledge. Knowledge is not expertise. Expertise is not execution.
Manivelarasan Tamizharasan

Senior Consultant, Japan Industry Solutions Delivery @ Microsoft | Copilot & Azure AI Enthusiast | Power Platform & Dynamics CE Expert | Bridging Technical Innovation & Business Strategy Across Global Markets

6 months ago

Thanks for sharing this from ground zero. The challenges of building effective and sophisticated chatbots that exactly mimic a human agent are real. I have personally used real-time web search results and knowledge search using Microsoft Azure AI Search (hybrid search, i.e., both keyword and vector with semantic reranking, plus many other built-in cognitive skills) for RAG, and the responses were much better with clear and concise prompts. With the ongoing advancements in models, model architectures, orchestration, and RAG, I believe the gap between theory and practice will steadily close.

Mangesh Deshpande

Business Transformation | Target Operating Model | Operational Excellence | Agile & Design thinking

7 months ago

Thanks for sharing! So do you think the applicability of these models lies only at the first level, using chatbots more as a filtering mechanism to churn through run-of-the-mill queries, with a human customer service provider getting involved beyond the second level, which requires domain expertise?
