Building an AI Technical Assistant: Leveraging ChatGPT and Embeddings

It goes without saying that there has been a great deal of activity and buzz surrounding OpenAI and, more specifically, ChatGPT. This has generated (no pun intended) a great deal of excitement and discussion across social media about the boundless potential that generative AI can bring to various industry verticals.

Recently I dipped my toes in the water and tested the summarisation capabilities of ChatGPT. In short, it does a great job of taking the information we have in 'technician speak' and converting the technical jargon and often incomplete sentences into 'customer speak', with the aim of making it easier for customers to understand. Below is an example of the outcome, and although this example is impressive, I don't think it fully demonstrates the capabilities and potential of generative AI.

So what's next? What if we were able to add a conversational layer over our own expansive data sources, enabling employees such as technicians to conduct semantic search and draw insights from the data? Imagine being able to go beyond lexical searches (e.g. searching for an error code) and start asking complex questions of the data, receiving comprehensive, detailed responses in return. For example:

  • What other machines have had the same error code in the past 6 months?
  • What other PC200-8 excavators have had the same error code within 50 hours of the same SMR (hours of operation)?
  • What were the resolutions and replacement part numbers in each of these cases?
  • Do we have the part number in stock? If so, which warehouse?

The possibilities and benefits here are endless, as we can essentially create an AI technical assistant with the knowledge our technicians have acquired over the years. So with this said, and as part of my own learning, I have been working with one of our engineers to create and deploy a prototype for "TechGPT", a chatbot that can assist technicians while they are out in the field, using our own data.

As you may already know, ChatGPT was trained on a large corpus of text spanning a diverse range of sources, including books, articles, websites, and other forms of written text, collected up until 2021. What it wasn't trained on was 'our' data, and as such, asking questions about our own data won't really work, at least for now.

Fortunately, there are a few options here. One of them involves using embeddings. Through embeddings, we can create numerical representations (vectors) of our text, ranging from single words to entire paragraphs. With vectors, we can then apply mathematical techniques to measure the proximity, or similarity, between two pieces of text.

As an example, here I am using the text-embedding-ada-002 model to create vectors for the names of fruit. A common way to compare vectors is cosine similarity; the maths behind it is beyond my mere comprehension, but it essentially measures the cosine of the angle between two vectors, with a score closer to 1 indicating that the two pieces of text are more similar in meaning.
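The idea can be sketched with toy vectors (the three-dimensional values below are made up purely for illustration; real ada-002 embeddings have 1,536 dimensions):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors: closer to 1.0 means more similar."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy stand-ins for real embeddings returned by text-embedding-ada-002.
apple = np.array([0.9, 0.1, 0.2])
pear = np.array([0.8, 0.2, 0.3])
laptop = np.array([0.1, 0.9, 0.7])

print(cosine_similarity(apple, pear))    # fruits land close together (~0.98)
print(cosine_similarity(apple, laptop))  # unrelated concepts are further apart (~0.30)
```

A vector compared with itself scores exactly 1.0, which is the intuition behind using the closest-scoring vectors as search results.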

For this example, I extracted a portion of a dataset from Azure Cognitive Search into a .CSV file. The dataset comes from our troubleshooting data, which includes a detailed record of all faults and the corresponding corrective actions taken worldwide. Because the data comes from all around the world, it is made up of many different languages.

To prepare the data, features describing the machine and issue, such as the model, serial number, findings and any corrective actions, were concatenated into a single column. Next, the column was fed into the text-embedding-ada-002 model to generate embeddings. For this scenario, these embeddings were then saved as a pickle file.
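A rough sketch of this preparation step, using only the standard library (the column names are illustrative, and `embed_text` is a placeholder for where the real call to text-embedding-ada-002 would go):

```python
import csv
import io
import pickle

def embed_text(text: str) -> list[float]:
    # Placeholder: in the real workflow this would call the
    # text-embedding-ada-002 model and return a 1,536-dimensional vector.
    return [float(len(text)), float(text.count(" "))]

# A couple of rows standing in for the exported troubleshooting .CSV.
sample_csv = io.StringIO(
    "model,serial,findings,corrective_action\n"
    "PC200-8,12345,Error code DK30KX logged,Replaced hydraulic sensor\n"
    "PC200-8,67890,Intermittent fault at 1200 SMR,Re-seated wiring harness\n"
)

records = []
for row in csv.DictReader(sample_csv):
    # Concatenate the machine/issue features into a single text column...
    combined = " | ".join(row[col] for col in ("model", "serial", "findings", "corrective_action"))
    # ...then generate an embedding for that combined text.
    records.append({"text": combined, "embedding": embed_text(combined)})

# Persist the text/embedding pairs as a pickle file for the chatbot to load later.
with open("embeddings.pkl", "wb") as f:
    pickle.dump(records, f)
```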

In a production workflow, the embeddings could be stored in various platforms and services, such as Azure Cognitive Search, Cosmos DB, SQL database, or blob storage.

With our embeddings now available, the next step is a chatbot that offers a user interface to collect input from the user, match the input against the generated embeddings, and generate a response to return to the user. Here, Gradio was used to create the UI elements to mimic chatbot functionality. In a nutshell, the chatbot will:

  • Accept a user input in the form of a question e.g. "what is error code DK30KX?"
  • Maintain state information of each turn to determine whether a question is asked for the first time or is related to a previous conversation. I came across this great example of using Cosmos DB to store this history which will be useful to ensure scalability and performance in the future. The GitHub repo is here.
  • Process the user's query by converting it into an embedding
  • For new questions, apply cosine similarity between the query embedding and the embeddings in our pickle file (or storage of choice) to identify the closest vectors or pieces of text
  • For questions related to a previous conversation, use the ChatCompletion model to generate a response that is contextually appropriate with previous turns
  • Return the response back to the user in a clear and friendly manner
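The turn logic above could be sketched roughly as follows. Everything here is a toy stand-in: `embed` hashes characters instead of calling text-embedding-ada-002, the knowledge-base records are invented, and the follow-up branch returns a stub where the real chatbot would call the ChatCompletion model with the conversation history:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    # Stand-in for the text-embedding-ada-002 call: hash characters
    # into a tiny unit vector so the retrieval step below has something to score.
    vec = np.zeros(8)
    for i, ch in enumerate(text.lower()):
        vec[i % 8] += ord(ch)
    return vec / np.linalg.norm(vec)

# Knowledge base: texts and their embeddings, as loaded from the pickle file.
KNOWLEDGE = [
    "Error code DK30KX: hydraulic pressure sensor fault (illustrative record)",
    "Error code E02: engine overheat, check coolant level (illustrative record)",
]
KB_VECTORS = [embed(t) for t in KNOWLEDGE]

def answer(question: str, history: list[str]) -> str:
    history.append(question)  # per-turn state kept across the conversation
    if len(history) == 1:
        # First turn: retrieve the closest record by cosine similarity.
        q = embed(question)
        scores = [float(np.dot(q, v)) for v in KB_VECTORS]  # unit vectors: dot = cosine
        return KNOWLEDGE[int(np.argmax(scores))]
    # Follow-up turn: the real chatbot hands the history to the ChatCompletion model.
    return f"(chat completion over {len(history)} turns of context)"
```

In the real prototype, the history list would be backed by Cosmos DB (as per the repo mentioned above) rather than kept in memory.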


For deployment, we containerised our Python script and pushed the image to an Azure Container Registry. From there, it was relatively straightforward to create a Linux Azure App Service from the Docker image. Alternatively, you could use Azure Container Instances or Azure Kubernetes Service, depending on your requirements.


Overall, generative AI is an effective way to create virtual assistants and experts that can provide personalised, scalable, consistent, and accurate responses to user queries based on historical data. Every organisation has vast amounts of data, and generative AI can leverage it to generate insights and create value across various domains, such as customer service, healthcare, and finance.
