Creating Conversational AI with a User-Centric Mindset: A Step-by-Step Guide with ChatGPT-4
In this article, we walk through how to create a chatbot that responds the way you want using ChatGPT-4, the latest GPT release by OpenAI: setting the architecture, knowledge retrieval, and prompt design and engineering. You will see how certain API parameters and prompt engineering choices can change the tone, personality, and voice of the response. Let’s begin!
Step 1: Picking the right model (GPT-4)
Note: We initially built the chatbot using GPT-3.5 but later updated it to GPT-4; the following shows how you can go about choosing which model to use:
First things first, it is time to find the right GPT model for the chatbot. Out of the five GPT-3.5 models available at the time of development, we decided on the gpt-3.5-turbo model.
Step 2: Setting the Right Architecture
Now that we have picked the model, it’s time to set the architecture. Let’s take a step back and think about the goal of the chatbot: even though our user is chatting with a non-human to get answers, we want to mimic a human conversation. To create a natural conversation that fulfills this goal, let’s answer three main questions to figure out which variables we are optimizing for:
1. What enables someone to answer a question well?
2. Then, what makes up someone’s response?
3. Outside of that, how can we enable the architecture to work efficiently at scale?
The answers to the three questions above can now be translated into the following architecture, which consists of three main modules:
3. Prompt engineering: this is where we craft the prompt sent to ChatGPT to get the quality of response we want; its components are detailed in Step 3 below.
Now that we have a high-level overview of how the modules are set up, let’s dive deeper into their implementation!
Step 3: Prompt Engineering
Prompt engineering consists of sending a combination of:
1. The engineered prompt (the guideline for how to answer)
2. The relevant knowledge retrieved from the internal documents
3. The previous chat history
4. The user’s query
to ChatGPT via the API.
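To make that concrete, here is a minimal sketch of how these pieces might be combined into a single API call, using the openai Python library (v0.x, the version current at the time of writing). The prompt wording and variable names are illustrative assumptions, not Bill-d’s actual implementation:

```python
import openai

openai.api_key = "sk-..."  # your OpenAI API key

def build_messages(engineered_prompt, knowledge, chat_history, user_query):
    """Combine the engineered prompt, retrieved knowledge, previous chat,
    and the user's new question into the messages payload for the API."""
    messages = [
        # The engineered prompt plus the retrieved knowledge form the guideline
        {"role": "system",
         "content": engineered_prompt + "\n\nKnowledge:\n" + knowledge},
    ]
    # Previous chat: alternating {"role": "user"/"assistant", ...} entries
    messages.extend(chat_history)
    messages.append({"role": "user", "content": user_query})
    return messages

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # swap in "gpt-4" once you have access
    messages=build_messages(
        "Answer only using the knowledge provided below.",
        "...retrieved document passages go here...",
        [],
        "What are some challenges for a new product manager at Bld.ai?",
    ),
)
print(response["choices"][0]["message"]["content"])
```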
The temperature is a parameter of the ChatGPT API that we can easily set. I’m going to divide Step 3 into two main sections:
1) Setting the temperature
Here is the very first version of Bill-d, but its responses don’t sound much different from reading the document itself. This is because the temperature, which determines the amount of randomness and creativity in the model’s responses, was set to zero. This means the model always selects the highest-probability word as the next word, which will always be a word from the knowledge, i.e., the documents.
Raising the temperature leads to more variation, randomness, and creativity. However, a very high value increases the risk of “hallucination”: unpredictable output such as going off-topic or generating repeated words and gibberish that doesn’t make sense grammatically. Here, we see that the response with temperature = 1 is a sentence full of words that do not relate to each other; randomness at its highest.
Trying out different temperature values, we saw the best performance with the temperature set at 0.3, which gives the model a little flexibility to actually respond as a language model rather than just a knowledge-search bot, while keeping enough sanity to respond with proper words and sentence structure.
As you can see, the temperature can make a big difference in the quality of the response a user receives by adding interpretation and elaboration, so I recommend playing around with the temperature value to see what fits your end-users’ needs best.
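As a rough illustration, comparing temperatures only requires varying one parameter on an otherwise identical request (the prompt below is a placeholder; 0.3 is the value we settled on, but your optimum may differ):

```python
import openai

messages = [
    {"role": "system", "content": "Answer only using the provided knowledge."},
    {"role": "user", "content": "What are some challenges for a new PM at Bld.ai?"},
]

for temperature in (0.0, 0.3, 1.0):
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,        # identical prompt each time, for a fair comparison
        temperature=temperature,  # 0 = deterministic, higher = more random/creative
    )
    print(f"--- temperature={temperature} ---")
    print(response["choices"][0]["message"]["content"])
```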
2) The 4 Experiments
The purpose of this exercise is to show how different combinations of engineered prompts, knowledge, previous chat, and a specific tone impact the overall voice of the response. Here, we run four experiments:
Experiment #1: Sending all relevant domain knowledge with a baseline engineered prompt and previous chat
Experiment #2: Send only the most relevant knowledge by weighting existing knowledge based on similarity with the user’s prompt
Experiment #3: No guiding engineered prompt, but send all relevant domain knowledge and previous chat
Experiment #4: Same as experiment #1 but talk in the tone of a Pirate
For this set of experiments, we ask each experiment the same question for control:
What are some challenges that I should foresee as I join Bld.ai as a new product manager?
Let’s see what the response looks like for each of the experiments and compare to see what we want for Bill-d!
Experiment #1: Baseline vs Experiment #2: Weighted Retrievals
For both experiments 1 and 2, we keep the engineered prompt the same: we guide ChatGPT to answer only with the knowledge given by the internal documents. What differs between the two experiments is the knowledge we use to answer the user’s questions.
In Experiment 1, we feed all the retrieved information from the internal documents to ChatGPT and let it generate the response with its own interpretation. In Experiment 2, by contrast, we feed only the retrieved information we consider most relevant to the user’s question, weighting passages via semantic search. This means ChatGPT won’t be as broadly informed but will give more focused and specific answers.
Comparing the responses from Experiment 1: Baseline and Experiment 2: Weighted Retrievals, you can see that the response from Experiment 2 is more focused on the context and knowledge of Bld.ai, while Experiment 1 stays more general.
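For readers curious how the weighting in Experiment 2 might look in code, here is a minimal sketch of ranking knowledge chunks by embedding similarity. The chunking scheme, embedding model, and top-k value are assumptions for illustration, not the exact Bill-d settings:

```python
import numpy as np
import openai

def embed(text):
    """Embed a piece of text with the OpenAI embeddings endpoint (v0.x API)."""
    result = openai.Embedding.create(input=[text], model="text-embedding-ada-002")
    return np.array(result["data"][0]["embedding"])

def top_k_chunks(question, chunks, k=3):
    """Return the k knowledge chunks most semantically similar to the question."""
    q = embed(question)
    scored = []
    for chunk in chunks:
        c = embed(chunk)
        # Cosine similarity: dot product of the two vectors after normalization
        score = np.dot(q, c) / (np.linalg.norm(q) * np.linalg.norm(c))
        scored.append((score, chunk))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]

# Only the top-scoring chunks get sent to ChatGPT, instead of everything.
relevant = top_k_chunks(
    "What are some challenges for a new PM at Bld.ai?",
    ["...chunk 1...", "...chunk 2...", "...chunk 3...", "...chunk 4..."],
)
```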
Experiment #3: Unguided
As for Experiment #3: Unguided, we provide knowledge and a query without a proper prompt/guide, which is equivalent to how ChatGPT works if you hand it a document and ask about it: the response will be more general and less focused on the context of Bld.ai.
As you can see, compared to Experiment #1: Baseline and Experiment #2: Weighted Retrievals, the response in Experiment #3: Unguided gives more of a general list of challenges that PMs face, explained far less in the context of Bld.ai and its processes.
Experiment #4: Adding voice!
Lastly, Experiment #4: Pirate shows what it’s like to add a voice to the response. Here, we instruct the prompt to speak like a pirate on top of our Experiment #1: Baseline setup. The response is a lot shorter than Experiment 1’s, and you can see a drastic difference in personality: it is less organized and elaborate, but it still tells you the right steps to take.
Tone and voice are important in conveying information in a conversation. Although this pirate is sassy, imagine asking GPT to speak like a five-year-old for children interacting with a gaming product, or like Elon Musk for Tesla employees asking questions about the company.
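A hedged sketch of how such a voice can be layered onto the baseline; the exact wording of Bill-d’s prompts is not shown in this article, so the strings below are illustrative only:

```python
base_prompt = (
    "You are Bill-d, an assistant for Bld.ai employees. "
    "Answer only using the knowledge provided below."
)

# Experiment 4 simply appends a voice instruction to the baseline prompt;
# swap in "a five-year-old" or any other persona the same way.
pirate_prompt = base_prompt + " Respond in the voice of a pirate."

messages = [
    {"role": "system",
     "content": pirate_prompt + "\n\nKnowledge:\n" + "...retrieved passages..."},
    {"role": "user", "content": "What are some challenges for a new PM at Bld.ai?"},
]
```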
From the four experiments, we see that the tone, personality, depth, and breadth of the responses differ based on the temperature, guideline, knowledge, and voice. Choosing which experiment to deploy can be done based on a hypothesis of what’s most impactful to the user, or by experimenting, gathering feedback from users, and using that data to roll out the winning variant to the general user base.
Step 4: Setting up knowledge retrieval for efficiency and scalability
Now that we have explored what we want for Bill-d’s voice and personality, it’s time to think about making the implementation scalable, so that response speed stays roughly constant as the number of documents grows to expand the knowledge set.
First, it’s crucial to set up semantic search so that we only retrieve the information Bill-d needs to answer the user’s question instead of scanning every document, including irrelevant knowledge. This saves time and cost, especially as the number of documents grows.
Second, when applying the same architecture to large documents or connecting it to a large knowledge base, it is crucial to use a fast vector store such as Milvus (or the managed Zilliz offering) to store the embeddings of those documents, rather than relying on system I/O to read them from CSV files or similar.
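As a sketch of what that looks like in practice, here is a hypothetical query against a Milvus collection using pymilvus; the collection name, field names, and connection details are assumptions about the setup, not Bill-d’s actual configuration:

```python
import openai
from pymilvus import Collection, connections

# Connect to a running Milvus instance (host/port assume a local default setup)
connections.connect(alias="default", host="localhost", port="19530")

# Assumed schema: each row holds an "embedding" vector and the original "text"
docs = Collection("internal_docs")
docs.load()

question = "What are some challenges for a new PM at Bld.ai?"
q_vec = openai.Embedding.create(
    input=[question], model="text-embedding-ada-002"
)["data"][0]["embedding"]

# The vector index performs the similarity search, so lookups stay fast
# even as the number of stored documents grows
results = docs.search(
    data=[q_vec],
    anns_field="embedding",
    param={"metric_type": "IP", "params": {"nprobe": 10}},
    limit=5,                  # top-5 most similar chunks
    output_fields=["text"],
)
for hit in results[0]:
    print(hit.entity.get("text"))
```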
Step 5: Creating an optimal conversation experience with ChatGPT
Lastly, when designing a chatbot with ChatGPT, user experience is an essential consideration. Because ChatGPT is a language model that generates responses based on its input, the chatbot’s responses can sometimes be unpredictable or unrelated to the user’s query. This can be frustrating because it’s hard to understand why a response is the way it is, so we added a feature that lets users see which knowledge from the internal documents was used to derive the response, helping them understand how Bill-d generated it.
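One way to support that transparency feature, sketched under the same assumptions as before, is simply to return the retrieved passages alongside the generated answer so the UI can display them as sources:

```python
import openai

def answer_with_sources(question, retrieved_chunks):
    """Generate an answer and return it together with the passages it was
    grounded in, so users can see where the response came from."""
    knowledge = "\n\n".join(retrieved_chunks)
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "Answer only from the knowledge below.\n\n" + knowledge},
            {"role": "user", "content": question},
        ],
        temperature=0.3,
    )
    answer = response["choices"][0]["message"]["content"]
    return {"answer": answer, "sources": retrieved_chunks}
```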
To ensure a positive user experience and drive continuous engagement, we should consider not only the chatbot’s tone and personality but also the user interface and how users interact with the product. By designing an experience that is intuitive, responsive, and user-friendly, businesses can improve customer satisfaction, increase engagement, and build stronger relationships with their customers.
Takeaways
Wrapping it all up, building a chatbot with OpenAI’s ChatGPT-4 was incredibly fast, but to get great responses and outcomes, it is crucial to consider user experience when designing the chatbot.
Designing the user experience with ChatGPT includes not only interface design but also prompt design and engineering, a science of its own. To ensure a positive user experience, builders such as engineers, designers, and product managers should consider the chatbot’s tone and personality, the user interface, and the chatbot’s ability to understand and respond to user queries. Despite the challenges involved in designing an effective chatbot, ChatGPT offers tremendous opportunities for businesses to automate customer service, personalize customer interactions, and improve communication with customers and partners around the world. By leveraging the power of ChatGPT and prioritizing user experience, businesses can build chatbots that improve customer satisfaction, increase engagement, and drive business growth.
Since working on this prototype, OpenAI has released plug-ins, such as the retrieval plug-in, that make it even more accessible and efficient for companies to adopt generative AI in their products. With these advancements, businesses can expect to build chatbots that are even more effective and user-friendly.
So what does this all mean when building a solution with ChatGPT?
In the end, success is defined by the end-user’s satisfaction with the response they get from the product. As builders, we can run a variety of experiments and gather feedback from users to determine, or even personalize, the type of prompt-engineered responses they get. What’s most important is understanding how the engineered prompt, the previous chat, and the knowledge we send shape the responses that come out, and having experts such as engineers and product managers who can make informed decisions about the type of response the end-user needs most.
Some modifications were made with ChatGPT-4.
Credit: Original article published by Jee Soo (Eunice) Choi at https://medium.com/@eunice.choi/designing-a-chatbot-with-chatgpt-79afd818cbff
The bld.ai team behind this: amazing engineers Ahmed Mohamedeen and Mohamed Alaa for building this with a user-centric mindset, and Danny Castonguay, Andres Desantes, Liam Hough, and Kenny Wei for review.