RAG (Retrieval-Augmented Generation) For Dummies. Demystifying A Key Design Pattern For Developing Enterprise AI Applications
Hassan Syed
Architect | Cloud Advisor | Azure Certified Solution Expert | Generative AI | Enterprise Systems Expert | IoT Solutions | Big Data | Digital Transformation Leader | Integration Architect | Hands-on| Mentor
RAG for Dummies - Building AI Apps with Your Data and the Power of LLMs
Disclaimer: This book cover doesn't exist (we do need one!)
My eight-year-old daughter avoids taking those "For Dummies" books to school. She thinks her classmates would make fun of her (well, I had similar thoughts at college too :) ).
But let's be real: when it comes to foreseeing the future of AI-powered systems in our organisations, we adults often feel like dummies ourselves.
So let's take a step beyond ChatGPT and dive into a popular design pattern for building smart enterprise applications that combine the power of AI (LLMs or SLMs) with the relevance of your enterprise data (current and accurate).
The Challenge: Find and Win the Construction Tenders Lightning Fast
Walk with me through this use case: a simulated story of a detective, but in sales.
Imagine you're a Sales Opportunity Detective given this brief:
Find the latest building construction tenders released by the state government in the last 7 days.
You've got your magnifying glass and detective hat on, ready to sift through piles of digital documents scattered across websites. However, you have a secret weapon that others do not have (yet): a Retrieval-Augmented Generation (RAG) application.
In summary, your RAG app will do three key things: retrieve the relevant data, augment your query with it, and generate the answer.
Let's break down each step.
1. Retrieval: The Data Detective
The first step is like sending our detective out to gather clues.
In your organisation, for another use case, this could mean extracting relevant content from your company portals, document folders and drives, databases, and so on. Your RAG app needs a smart search mechanism to find the relevant content in your enterprise data stores or in a store dedicated to the app.
2. Augmentation: The Clue Filter
Now that our detective has gathered a pile of clues (tenders), the next step is sorting through evidence, keeping only the relevant clues and organising them neatly. This is the augmentation phase. Here’s what happens:
3. Generation: Getting Hold of All the Info on Our Target
With the clues sorted, it’s time for RAG to tell the story. The generation phase is where the magic happens:
This step is like our detective writing a final report, detailing everything clearly so you can understand it easily.
The Detective’s Report
Here’s an example of what the final report might look like:
Latest Tenders for Building Construction by New South Wales Government
Tender 1: Construction of New School Building
Description: This tender invites bids for the construction of a new school building in Sydney.
Deadline: June 15, 2024
Requirements: Experience in educational building projects, compliance with local regulations.
Contact: tender.nsw.gov.au
Tender 2: Hospital Wing Extension
Description: This tender calls for the extension of the west wing of a hospital in Newcastle.
Deadline: June 17, 2024
Requirements: Previous hospital construction experience, detailed project plan.
Contact: tender.nsw.gov.au
Summing up the system flows in this RAG Application Demo
The Technical Side
The reference diagram from Microsoft below gives you a good idea of the mechanics at a high level. Next, you will see details on the options for implementing an effective local search engine.
Below is a sample technical implementation using an OpenAI LLM; a code sketch follows the steps. The hard part is implementing effective local search over the enterprise data. If that part is done correctly, the other bits are ordinary programming challenges.
1. Document Indexing:
- Use an indexing tool like Elasticsearch, Apache Lucene, or other search technologies to index your documents. This process involves creating a searchable index of your documents that can be quickly queried.
2. Embedding-based Retrieval:
- Use embeddings to represent your documents and queries in a vector space. You can leverage pre-trained models like Sentence-BERT or other embedding models to create vector representations of your documents.
- Store these embeddings in a vector database such as FAISS, Pinecone, or Milvus.
3. Query Processing:
- When a user submits a query, convert the query into an embedding using the same model you used to embed your documents.
- Perform a similarity search in your vector database to find the most relevant documents or snippets based on the query embedding.
4. Context Augmentation:
- Retrieve the top relevant snippets or documents from your local source based on the similarity search.
- Combine the user query with these retrieved snippets to create an augmented context.
5. Sending to OpenAI:
- Send the augmented context (user query + relevant local snippets) to the OpenAI API for response generation.
- The OpenAI model can now use this enriched context to generate a more accurate and relevant response.
6. Response Delivery:
- Receive the response from OpenAI and deliver it to the user through your application interface.
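To make steps 2 and 3 concrete, here is a minimal sketch of embedding-based retrieval, assuming the sentence-transformers and faiss libraries are available; the model name, sample documents, and the retrieve helper are illustrative choices, not the only way to do it.

```python
# Minimal sketch of steps 1-3: index a few documents as embeddings and
# retrieve the most similar ones for a query. The model name, documents,
# and the `retrieve` helper are illustrative assumptions.
import faiss
from sentence_transformers import SentenceTransformer

# Illustrative enterprise snippets; in practice these come from your
# document indexing pipeline (step 1).
documents = [
    "To apply for a local business permit, fill out form XYZ and submit it to the local council.",
    "Tender: construction of a new school building in Sydney, closing June 15, 2024.",
    "Tender: extension of the west wing of a hospital in Newcastle, closing June 17, 2024.",
]

# Step 2: embed the documents into a vector space and store them in a FAISS index.
model = SentenceTransformer("all-MiniLM-L6-v2")
doc_embeddings = model.encode(documents, convert_to_numpy=True).astype("float32")
index = faiss.IndexFlatL2(doc_embeddings.shape[1])
index.add(doc_embeddings)

def retrieve(query: str, top_k: int = 2) -> list[str]:
    """Step 3: embed the query with the same model and return the top-k most similar snippets."""
    query_embedding = model.encode([query], convert_to_numpy=True).astype("float32")
    _, indices = index.search(query_embedding, top_k)
    return [documents[i] for i in indices[0]]

print(retrieve("latest building construction tenders"))
```

A flat L2 index is fine for a demo; a production setup would typically use a managed vector store and an approximate-nearest-neighbour index instead.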
Example Workflow in Detail (see the code sketch after these steps):
1. User Query: “How do I apply for a local business permit?”
2. Embedding Creation:
- Convert the query to an embedding using a model like Sentence-BERT.
3. Similarity Search:
- Use the query embedding to search for the top relevant document embeddings in your vector database (FAISS, Pinecone, etc.).
- Retrieve the top-k documents or snippets that are most similar to the query embedding.
4. Context Augmentation:
- Combine the query with the retrieved snippets: “User query: How do I apply for a local business permit? Retrieved snippet: To apply for a local business permit, you need to fill out form XYZ and submit it to the local council along with the required documents.”
5. Send to OpenAI:
- Send the augmented text to OpenAI: “How do I apply for a local business permit? To apply for a local business permit, you need to fill out form XYZ and submit it to the local council along with the required documents.”
6. Receive and Deliver Response:
- OpenAI generates a detailed response based on the augmented context.
- Response: “To apply for a local business permit, first, fill out form XYZ, which you can download from the local council’s website. Ensure you have all the required documents, such as proof of identity and business registration. Submit these to the local council office. You can find more details on their official website or contact their support desk.”
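Here is a minimal sketch of steps 4-6 for this workflow, reusing the retrieve helper from the previous sketch. The OpenAI model name and the prompt wording are assumptions; any chat-capable model would work the same way.

```python
# Minimal sketch of steps 4-6: augment the query with retrieved snippets and
# ask an OpenAI model to answer from that context. Assumes the `retrieve`
# function from the previous sketch and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer(query: str) -> str:
    # Step 4: combine the user query with the retrieved snippets.
    snippets = retrieve(query)
    augmented_context = "\n".join(snippets)

    # Step 5: send the augmented context to the OpenAI API for generation.
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Answer the user's question using only the provided context."},
            {"role": "user",
             "content": f"Context:\n{augmented_context}\n\nQuestion: {query}"},
        ],
    )
    # Step 6: return the generated answer to the application layer.
    return response.choices[0].message.content

print(answer("How do I apply for a local business permit?"))
```

Instructing the model to answer only from the provided context is what keeps the response grounded in your enterprise data rather than the model's general knowledge.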
The Summary: Why RAG is Awesome
Here’s why it’s so valuable:
So, there is no doubt we are going to see a large number of RAG applications appearing on the enterprise horizon. The sooner you start in this space, the more beneficial it will be for your organisation in the never-ending market competition.