Retrieval-Augmented Generation (RAG) for Newbies

Joel Garcia

CEO of AllCode | Need an AWS Solution Architect?

Published: February 28, 2024

Clients come to us all the time with the following request:

“We are actively looking for developers to help us build our envisioned not-yet-named AI.”

We’ll hop on a call where we’ll start to discuss the idiosyncrasies of Large Language Models (LLMs). We’ll dive into prompt engineering, parameter-efficient fine-tuning, fine-tuning with reinforcement learning from human feedback, etc.

Inevitably, we’ll get to the question of, “How do we train AI on our data?”, and I’ll say, “Ah, that’s an ancient Chinese secret called Calgon. Hahah. Just kidding. No, really. You want RAG: Retrieval-Augmented Generation.”

What’s RAG? How does RAG enable the ChatGPTs and Microsoft Bings of the world to know about your internal data?

To answer these questions, we need to start at the beginning.

Prompt Engineering

When you begin interacting with a model like ChatGPT or Bing Chat powered by GPT-4, you start by issuing a prompt to elicit a response.

We’ll start with a prompt that a client of ours, Let’s Play Soccer (LPS, https://www.letsplaysoccer.com), a chain of indoor soccer facilities, receives daily: “I want to rent a field.”

Here’s a real-world example that came in through a contact form yesterday:

“We are wondering how much it would cost to rent a field for an hour weekly/bi-weekly during a weekday early in the morning (around 6am) Facility: Timpanogos Indoor Soccer”

Notice that the request is rough. The grammar hasn’t been cleaned up, but we do get a little metadata in the facility name.

What happens when we feed this request into GPT-4?

GPT-4 spits out the following:

“The cost of reserving a field at Let's Play Soccer depends on the facility, the time, and the frequency of your reservation. According to the Let's Play Soccer website https://www.letsplaysoccer.com, the prices vary from $50 to $150 per hour for different facilities. You can also get discounts for booking multiple hours or sessions in advance https://www.letsplaysoccer.com, etc…”

The completion is decent considering the LLM only knows what it has picked up from the LPS website, but the response from GPT-4 starts rambling; that’s the “etc…” section. The rambling isn’t quite as bad as listening to a Trump or Biden debate, but it’s close. The LLM isn’t able to succinctly answer the question, which causes issues if you try to use it in your application. In AI lingo, when the model fills the gaps with plausible-sounding details that it can’t back up, it is said to be “hallucinating”.

How do we get the LLM to know more about Let’s Play’s field rental costs to answer the prompt more succinctly?

Well, you can either try to build a multi-billion parameter model from scratch, which requires millions of GPU compute hours, or you can leverage Retrieval-Augmented Generation (RAG).

Retrieval-Augmented Generation

With RAG, you can improve the LLM’s responses without incurring the expense of building a new model, by making use of external data sources that house your data.

How do you include external data sources?

Well, it’s relatively simple. You improve the prompts by having your AI workflow include steps to reach out to external data sources. These data sources can be external databases, PDF or Word documents, etc.

To include external data sets in your prompt, we need to go through the following steps:

Data Preparation

If your proprietary data is housed in a database, PDF, or Word document, you’re going to start by extracting the data from these sources.

Next, you’re going to store this data in a vector database.

What’s a vector database?

A vector database will store the records of your database table or document as rows, but instead of having a scalar value as the index, the records will be indexed by vector embeddings that attempt to capture the meaning of the data in the record. We could dive into the linear algebra of the dot product of the prompt vector embedding with the record’s vector embedding, but we are trying to keep this post as simple as possible. In layman's terms, the vector database is used to find other records with similar content.
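For readers who do want a peek at the math, here is a minimal sketch of how similarity between a prompt embedding and a record embedding can be scored with a normalized dot product (cosine similarity). The three-dimensional vectors and the record names are made up for illustration; real embeddings come from an embedding model and have hundreds or thousands of dimensions.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Dot product scaled by both vector lengths; closer to 1.0 means more similar."""
    norm_a = math.sqrt(dot(a, a))
    norm_b = math.sqrt(dot(b, b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot(a, b) / (norm_a * norm_b)


# Hypothetical embeddings, invented for this sketch only.
prompt_vec = [0.9, 0.1, 0.3]
records = {
    "field rental rates": [0.8, 0.2, 0.4],
    "league registration": [0.1, 0.9, 0.2],
}

# Score every record against the prompt and pick the closest one.
scores = {name: cosine_similarity(prompt_vec, vec) for name, vec in records.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))
```

A vector database does essentially this comparison at scale, typically using an approximate index so that it does not have to score every record.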

Let’s Play Soccer has a database table that contains rates for fields on specific days of the week by facility. These records are put into the vector database with vector embeddings that capture a meaning along the lines of “renting a field at Timpanogos”.
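The data preparation step can be sketched roughly as follows. This is an illustration under assumptions, not LPS’s actual pipeline: the column names, the `row_to_text` helper, and the in-memory `vector_store` list are hypothetical, and the `embed` function is a toy word-count stand-in for a real embedding model. The two rates are the ones used in the example in this post.

```python
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts. A stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


# Hypothetical rows extracted from the rates table (column names are assumptions).
rate_rows = [
    {"facility": "Timpanogos Indoor Soccer", "field": "Field 1",
     "days": "Monday - Friday", "time": "6 am", "rate_usd": 50},
    {"facility": "Timpanogos Indoor Soccer", "field": "Field 2",
     "days": "Saturday - Sunday", "time": "6 am", "rate_usd": 100},
]


def row_to_text(row):
    """Flatten a table row into a sentence so its meaning can be embedded."""
    return (f"Renting {row['field']} at {row['facility']}: "
            f"{row['days']} at {row['time']} is ${row['rate_usd']} per hour")


# Each entry keeps the text, its embedding, and the original row as metadata.
vector_store = [
    {"text": row_to_text(row), "embedding": embed(row_to_text(row)), "metadata": row}
    for row in rate_rows
]

for entry in vector_store:
    print(entry["text"])
```

Keeping the original row as metadata alongside the embedding is a design choice that makes later filtering (by facility, day, or time) possible without parsing the text again.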

Application Integration

Now that our field rental information is in a vector database, how do we leverage it?

When the prompt to rent a field comes in from the mobile app or website, before we communicate with the LLM, we create a vector representation of the prompt. We query the vector database with the vector representation of the prompt to extract semantically similar info from the database table that contains the field prices. Now, the responses from the vector database may not be ideal. They’ll come back with a similarity score relative to the prompt, so we may need to do some additional re-ranking based on Facility ID, field, day of week, and time of day.
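A rough sketch of the query and re-ranking steps, under stated assumptions: the `embed` function is a toy word-count stand-in for a real embedding model, the `records` list stands in for the vector database, and the third record (a different facility with a made-up rate) is invented purely to show why re-ranking on metadata matters. The function and field names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts. A stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


# Stand-in for the vector database. The last record is made up for illustration.
records = [
    {"facility": "Timpanogos Indoor Soccer", "weekday": True,
     "text": "Timpanogos Indoor Soccer Field 1: Monday - Friday at 6 am is $50 per hour"},
    {"facility": "Timpanogos Indoor Soccer", "weekday": False,
     "text": "Timpanogos Indoor Soccer Field 2: Saturday - Sunday at 6 am is $100 per hour"},
    {"facility": "Hypothetical Other Facility", "weekday": True,
     "text": "Hypothetical Other Facility Field 1: Monday - Friday at 6 am is $60 per hour"},
]


def search(prompt, top_k=3):
    """Return the top_k (similarity score, record) pairs for the prompt."""
    query = embed(prompt)
    scored = [(cosine(query, embed(r["text"])), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


def rerank(hits, facility, weekday):
    """Prefer matching facility, then matching day type, then similarity score."""
    def key(hit):
        score, record = hit
        return (record["facility"] == facility, record["weekday"] == weekday, score)
    return sorted(hits, key=key, reverse=True)


prompt = ("We are wondering how much it would cost to rent a field for an hour "
          "weekly/bi-weekly during a weekday early in the morning (around 6am) "
          "Facility: Timpanogos Indoor Soccer")
hits = search(prompt)
ranked = rerank(hits, facility="Timpanogos Indoor Soccer", weekday=True)
top = ranked[0][1]
print(top["text"])
```

In this sketch the facility and day type are passed in by hand; in a real application they would be taken from the form metadata or extracted from the prompt.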

Next, we take the top-ranked responses from the vector database and add them to our original prompt. This augmented prompt now contains the information from Let’s Play Soccer’s internal field price table as well as the original prompt.
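Assembling the augmented prompt is mostly string concatenation. Here is a minimal sketch, assuming the retrieved records have already been ranked; the `build_augmented_prompt` helper and its `max_records` parameter are hypothetical names chosen for this example.

```python
def build_augmented_prompt(user_prompt, retrieved_texts, max_records=2):
    """Append the top-ranked retrieved records to the user's original prompt."""
    context = "\n".join(retrieved_texts[:max_records])
    return f"{user_prompt}\n{context}"


user_prompt = ("We are wondering how much it would cost to rent a field for an hour "
               "weekly/bi-weekly during a weekday early in the morning (around 6am) "
               "Facility: Timpanogos Indoor Soccer")

# Top-ranked records returned by the vector database, already sorted.
retrieved = [
    "Field 1: Monday - Friday at 6 am mst is $50 per hour",
    "Field 2: Saturday - Sunday at 6 am mst is $100 per hour",
]

augmented_prompt = build_augmented_prompt(user_prompt, retrieved)
print(augmented_prompt)
```

Capping the number of records keeps the prompt short, which matters because the LLM has a limited context window and longer prompts cost more to process.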

We now pass this augmented prompt to GPT-4:

We are wondering how much it would cost to rent a field for an hour weekly/bi-weekly during a weekday early in the morning (around 6am) Facility: Timpanogos Indoor Soccer
Field 1: Monday - Friday at 6 am MST is $50 per hour
Field 2: Saturday - Sunday at 6 am MST is $100 per hour

The Field 1 and Field 2 rates at the end of the prompt are the additional content that we’ve pulled from the vector database.

The LLM is now able to use the LPS information to generate a response that better answers the original question. Lo and behold, GPT-4 with this supplementary data spits out a completion that is customer-worthy.

Hi, thank you for your interest in renting a field at Timpanogos Indoor Soccer. Based on the information you provided, the cost of renting a field for an hour weekly/bi-weekly during a weekday early in the morning (around 6am) is as follows:

If you rent Field 1 on a weekday, the cost is $50 per hour. If you rent it weekly, the total cost is $200 per month. If you rent it bi-weekly, the total cost is $100 per month.

If you rent Field 2 on a weekend, the cost is $100 per hour. If you rent it weekly, the total cost is $400 per month. If you rent it bi-weekly, the total cost is $200 per month.

You can book your field online by visiting the [Timpanogos Indoor Soccer website] and filling out the reservation form. You can also call or text the facility manager at (801) 224-2500 to check the availability and confirm your booking. I hope this helps you find the best option for your team. If you have any other questions, feel free to ask me.
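It is worth sanity-checking any arithmetic an LLM produces before sending it to a customer. The monthly totals in the response above are consistent with a simple calculation that assumes four weeks per month and one-hour sessions; that assumption is inferred from the numbers, not stated by the model. A quick check, with hypothetical helper names:

```python
WEEKS_PER_MONTH = 4  # assumption inferred from the totals in the response


def monthly_cost(hourly_rate, sessions_per_month, hours_per_session=1):
    """Total monthly cost for a recurring rental."""
    return hourly_rate * hours_per_session * sessions_per_month


weekly_sessions = WEEKS_PER_MONTH          # one session every week
biweekly_sessions = WEEKS_PER_MONTH // 2   # one session every other week

field1_weekly = monthly_cost(50, weekly_sessions)
field1_biweekly = monthly_cost(50, biweekly_sessions)
field2_weekly = monthly_cost(100, weekly_sessions)
field2_biweekly = monthly_cost(100, biweekly_sessions)

print(field1_weekly, field1_biweekly, field2_weekly, field2_biweekly)
```

A check like this can run as a post-processing step in the application, so that a response with totals that do not match the retrieved rates is flagged before it goes out.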

Most startups and SMBs don’t want to spend large sums of money to train and serve a new LLM based on their custom data. They want to leverage their existing content to build a context-aware reasoning application to answer customer inquiries using their data.

With Retrieval-Augmented Generation (RAG), you can leverage your data in AI to cost-effectively make life easier for your customers.

To learn more about how to implement RAG in your organization, click on this link.
