Retrieval-Augmented Generation (RAG) Workflow
Jenya Stoeva
Why I created my own RAG Diagram:
1. Highlight the importance of data preparation – to illustrate that unstructured data must be prepared and embedded before it can be queried in a RAG system.
2. Emphasize dedicated query tools – to show the need for creating dedicated tools for each document, such as a search tool and a summarisation tool per document.
3. Showcase LLM function calling – to contrast the LLM’s ability to call tools dynamically with simple RAG systems, where the orchestration framework handles all tool invocation.
4. Lay the groundwork for agentic workflows – this diagram sets the stage for agentic workflows, which may be illustrated in a follow-up diagram.
Key Components and Flow
1. Documents or Data Sources
The process starts with preparing the data. The orchestration framework ingests unstructured data (text, PDFs, images, etc.) and integrates structured data sources (APIs, relational DBs, etc.).
This forms the knowledge base that the system can query later.
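As a rough illustration of this step, ingestion can start very simply. The sketch below assumes plain-text files in a local ./docs folder; a real pipeline would add PDF/image parsers and connectors for APIs and relational databases.

```python
# Minimal ingestion sketch: collect unstructured files into plain-text records.
# Assumes .txt files in ./docs; real pipelines handle PDFs, images, APIs, DBs.
from pathlib import Path

def ingest_documents(folder: str) -> list[dict]:
    """Read every .txt file in the folder into a (source, text) record."""
    return [{"source": p.name, "text": p.read_text(encoding="utf-8")}
            for p in Path(folder).glob("*.txt")]

knowledge_base = ingest_documents("./docs")
```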
2. Embedding Model & Vector Database
The embedding model converts the prepared data into vector embeddings, which are stored in the vector database so they can be searched efficiently by similarity.
Next, dedicated query tools are created per document, enabling e.g. search or summarisation over that document. For images, a single VectorIndex can handle multiple images (no need for per-image tools).
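A minimal sketch of the chunk-and-embed step, assuming the OpenAI Python SDK (v1.x) for embeddings and a plain list of dicts as a stand-in for a real vector database; the model name and the naive fixed-size splitter are illustrative assumptions, not requirements.

```python
# Chunk each document, embed the chunks, and keep (vector, text, source) rows
# as a toy in-memory vector store. Any embedding model / vector DB could be
# swapped in; "text-embedding-3-small" is just an assumed placeholder.
import numpy as np
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def chunk(text: str, size: int = 800) -> list[str]:
    """Naive fixed-size splitting; production systems use smarter chunkers."""
    return [text[i:i + size] for i in range(0, len(text), size)]

def build_index(records: list[dict]) -> list[dict]:
    index = []
    for rec in records:
        chunks = chunk(rec["text"])
        resp = client.embeddings.create(model="text-embedding-3-small",
                                        input=chunks)
        for ch, item in zip(chunks, resp.data):
            index.append({"vector": np.array(item.embedding),
                          "text": ch,
                          "source": rec["source"]})
    return index

index = build_index(knowledge_base)  # knowledge_base from the ingestion sketch
```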
3. User Query
A user submits a query in plain text (Step 5). This could be something like: “What were the key findings in the annual report?”
The system needs to enhance the query by retrieving relevant data.
4. Embedding the User Query
The user query is passed to the embedding model for conversion into a vector embedding (Step 6).
The result is an embedded query that allows the system to perform similarity searches (Step 7).
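Continuing the same toy setup, the query side of Steps 6–7 might look like the sketch below; the brute-force cosine-similarity scan stands in for what a real vector database does natively.

```python
# Step 6: embed the user query with the same model used for the documents.
def embed_query(query: str) -> np.ndarray:
    resp = client.embeddings.create(model="text-embedding-3-small",
                                    input=[query])
    return np.array(resp.data[0].embedding)

# Step 7: rank stored chunks by cosine similarity and return the k best.
def top_k(index: list[dict], query_vec: np.ndarray, k: int = 3) -> list[dict]:
    def cosine(a: np.ndarray, b: np.ndarray) -> float:
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return sorted(index,
                  key=lambda row: cosine(row["vector"], query_vec),
                  reverse=True)[:k]
```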
5. Retrieval Process (Triggered by Orchestration Layer or Function Calling)
The embedded query is passed to an LLM (Step 8), which can dynamically call the dedicated query tools to retrieve the most relevant data; in simpler RAG setups, the orchestration layer invokes the tools instead. A function-calling sketch follows below.
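One hedged sketch of the function-calling variant, assuming the OpenAI chat completions API and reusing the client from above; search_annual_report is a hypothetical per-document tool, and the orchestration layer remains responsible for actually executing whatever tool the LLM selects.

```python
# Step 8 sketch: expose one query tool per document and let the LLM decide
# which to invoke. The tool schema follows the OpenAI "tools" format.
import json

tools = [{
    "type": "function",
    "function": {
        "name": "search_annual_report",  # hypothetical per-document tool
        "description": "Search the annual report for passages relevant to a question.",
        "parameters": {
            "type": "object",
            "properties": {"question": {"type": "string"}},
            "required": ["question"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user",
               "content": "What were the key findings in the annual report?"}],
    tools=tools,
)

# The LLM may or may not choose a tool; check before indexing in real code.
call = response.choices[0].message.tool_calls[0]
args = json.loads(call.function.arguments)
# The orchestration layer now executes the named tool with these arguments.
```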
6. Combining Retrieved Data with Embedded Query
The retrieved data and the embedded query are combined to form a context-rich prompt (Step 11). This augmented prompt contains the most relevant documents or data that match the user's query.
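A simple illustration of this augmentation step; the template wording below is an assumption, not a prescribed format.

```python
# Step 11 sketch: prepend the retrieved chunks to the user's question so the
# LLM answers from the supplied context rather than from memory alone.
def build_prompt(question: str, hits: list[dict]) -> str:
    context = "\n\n".join(f"[{h['source']}] {h['text']}" for h in hits)
    return ("Answer the question using only the context below.\n\n"
            f"Context:\n{context}\n\n"
            f"Question: {question}")
```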
7. Passing the Context-rich Prompt to the LLM
The enriched prompt is sent to the LLM (Step 12).
This final generation step ensures the output is grounded in external knowledge rather than relying solely on the LLM’s internal training data.
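Tying the pieces together with the helpers sketched earlier (same assumptions; the model name is a placeholder):

```python
# Steps 5-12 end to end, reusing the toy helpers from the sketches above.
question = "What were the key findings in the annual report?"
hits = top_k(index, embed_query(question))  # retrieve the best-matching chunks

answer = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": build_prompt(question, hits)}],
).choices[0].message.content

print(answer)  # grounded in retrieved context, not only the LLM's training data
```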
8. Final Output
The LLM produces the final response, which is returned to the user. The output reflects the combined knowledge from the vector DB, relational DBs, APIs, and the other data sources used.
Additional Notes on Orchestration and Function Calling: