Day 1: Introduction to Retrieval Augmented Generation
This is part of the series: 10 Days of Retrieval Augmented Generation.
Before we start our first day, let's take a look at what lies ahead in this 10-day series:
Now, let's continue with Day 1: Introduction to Retrieval Augmented Generation.
Introduction to RAG
Before jumping into RAG, let's consider two scenarios.
Scenario 1
Imagine you're taking an open-book exam. You read the question and search through the book for the right answer, adopting different strategies: skimming the pages, or jumping to a specific chapter and then looking for the answer there.
What if you had a digital assistant that, when you asked it the question, read the book and fetched the best answer for you?
Scenario 2
Now imagine you're in a library with thousands of books. Again, suppose you have a question. But this time you don't know which book has the answer, where that book is shelved, or where inside the book the answer would be. If you start searching, it may take you hours, or even days, to find the right answer.
But if our digital assistant is present and you ask it the same question, then all the headache of searching for the answer is handed over to it, and you just wait, sipping your coffee, until the answer arrives. This digital assistant gets you the right answer, and in minutes.
The digital assistant we talked about in both scenarios is, in the field of Generative AI, powered by the concept of Retrieval Augmented Generation (RAG). Let us understand the steps RAG takes to give us the right answer.
RAG step by step
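The retrieve-then-generate flow can be sketched end to end in a few lines. Below is a minimal, self-contained illustration: it uses a toy bag-of-words "embedding" and cosine similarity purely for demonstration (real RAG systems use neural embedding models and a vector database), and the sample chunks and question are invented for this example.

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy "embedding": a bag-of-words count vector.
    # Real RAG systems use dense neural embedding models instead.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# 1. Index: split the knowledge source into chunks and embed each one
chunks = [
    "RAG retrieves relevant documents before generating an answer.",
    "Vector databases store embeddings for fast similarity search.",
    "LLMs generate fluent text but can hallucinate without grounding.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# 2. Retrieve: embed the question and pick the most similar chunk
question = "How does RAG find relevant documents?"
q_vec = embed(question)
best_chunk = max(index, key=lambda item: cosine(q_vec, item[1]))[0]

# 3. Augment + generate: ground the LLM's answer in the retrieved context
prompt = f"Context: {best_chunk}\n\nQuestion: {question}\nAnswer:"
print(prompt)
```

The key design point is step 3: instead of asking the LLM to answer from memory, we hand it the retrieved context inside the prompt, which is what keeps the answer grounded in our documents.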
Important LLM Models
It must be understood that at the core of RAG are LLMs (large language models). They are responsible for generating contextual responses to the questions asked. Given below is a list of the top 10 models currently used in industry.
Important Indexes
To create the knowledge repository for RAG, indexes are used. We will talk about them in more detail later, but let's list some of them.
As an overview: we chunk the document and create embeddings of the chunks. These embeddings are stored in one of the services listed above. Then we use different similarity measures to retrieve the answers.
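The chunking step in that overview can be made concrete. Below is a minimal sketch of fixed-size chunking with overlap, a common default strategy; the function name and the window sizes are illustrative choices, not a library API.

```python
def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into windows of `chunk_size` words, overlapping by
    `overlap` words so context is not cut off at chunk boundaries."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last window already covers the end of the text
    return chunks

# Example: a 100-word document with 50-word chunks and 10-word overlap
doc = " ".join(f"word{i}" for i in range(100))
pieces = chunk_text(doc)
print(len(pieces))  # 3 overlapping chunks
```

Each chunk would then be passed to an embedding model and stored in one of the vector indexes above; the overlap means a sentence falling on a boundary still appears whole in at least one chunk.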
This finishes our first day's discussion of RAG. Tomorrow we will look at the core components of RAG: chunking and embeddings, what we mean by prompts, what vector indexes are, different LLM frameworks, etc.