Agentic RAG Using CrewAI & LangChain!
Image Credits: Pavan Belagatti


In the rapidly evolving field of artificial intelligence, Agentic RAG has emerged as a game-changing approach to information retrieval and generation. This advanced technique combines the power of Retrieval Augmented Generation (RAG) with autonomous agents, offering a more dynamic and context-aware method to process and generate information. As businesses and researchers seek to enhance their AI capabilities, understanding and implementing Agentic RAG has become crucial to staying ahead in the competitive landscape.

This guide delves into the intricacies of mastering Agentic RAG using two powerful tools: LangChain and CrewAI. It explores the evolution from traditional RAG to its agentic counterpart, highlighting the key differences and benefits. The article also examines how LangChain serves as the foundation for implementing Agentic RAG and demonstrates the ways CrewAI can be leveraged to create more sophisticated and efficient AI systems.

The Evolution of RAG: From Traditional to Agentic

Limitations of traditional RAG


Traditional Retrieval Augmented Generation (RAG) systems have revolutionized AI by combining Large Language Models (LLMs) with vector databases to overcome the limitations of off-the-shelf LLMs. However, these systems struggle with multi-step tasks and are not well suited to complex use cases. They work fine for simple Q&A chatbots, support bots, and the like, but as soon as things get more complex, the traditional RAG approach falls short. In particular, such systems often struggle to contextualize retrieved data, leading to superficial responses that may not fully address the nuances of a query.

Introducing Agentic RAG

Agentic RAG emerges as an evolution of traditional RAG, integrating AI agents into the retrieval pipeline. This approach employs autonomous agents to analyze initial findings and strategically select effective tools for data retrieval. These agents can break a complex task down into several subtasks, making each one easier to handle. They also maintain memory (such as chat history), so they know what has already happened and what steps to take next. Moreover, they can call an API or tool whenever one is needed to solve a task, and they can reason about a problem and act accordingly. This is what makes the agentic RAG approach so powerful: the system deconstructs complex queries into manageable segments, assigns specific agents to each part, and maintains seamless coordination between them.
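The decompose-and-route loop described above can be sketched in a few lines of framework-free Python. The splitting heuristic, tool names, and routing keyword here are illustrative assumptions, not part of any library:

```python
def decompose(query: str) -> list[str]:
    """Split a compound query into independent subtasks (naively, on ' and ')."""
    return [part.strip() for part in query.split(" and ")]

def route(subtask: str) -> str:
    """Pick a retrieval tool: the vector store for known topics, else web search."""
    return "vectorstore" if "self-attention" in subtask else "web_search"

def agentic_rag(query: str) -> list[tuple[str, str]]:
    """Plan: decompose the query, then assign each subtask to a retrieval tool."""
    return [(sub, route(sub)) for sub in decompose(query)]

for subtask, tool in agentic_rag("explain self-attention and list recent LLM benchmarks"):
    print(f"{tool:11s} <- {subtask}")
```

A real agent would use an LLM for both steps; the point is only the shape of the plan: one query becomes several (subtask, tool) pairs handled independently.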

Key benefits and use cases of Agentic RAG

Agentic RAG offers numerous advantages over traditional systems. Its autonomous agents work independently, allowing for efficient handling of complex queries in parallel. The system’s adaptability enables dynamic adjustment of strategies based on new information or evolving user needs. In marketing, Agentic RAG can analyze customer data to generate personalized communications and provide real-time competitive intelligence. It also enhances decision-making in campaign management and improves search engine optimization strategies.

LangChain: The Backbone of Agentic RAG

Overview of LangChain

LangChain has emerged as a powerful framework for building Large Language Model (LLM) applications, showing exponential growth in its capabilities. It serves as a versatile tool, offering greater compatibility with various platforms compared to other frameworks. At its core, LangChain integrates cutting-edge technologies to enhance model performance with each interaction. The framework operates on a modular principle, allowing for flexibility and adaptability in processing natural language interactions.

Essential components for Agentic RAG

LangChain’s architecture supports both short-term and long-term memory capabilities, crucial for Agentic RAG systems. Short-term memory utilizes in-context learning, while long-term memory leverages external vector stores for infinite information retention and fast retrieval. These components enable LangChain to excel in understanding context, tone, and nuances within conversations, leading to more human-like interactions.
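As an illustration, here is a toy, dependency-free sketch of the two memory tiers (this is not LangChain's actual API): a bounded buffer standing in for the context window, and a tiny vector store queried by cosine similarity. The two-dimensional keyword-count "embedding" is purely a stand-in for a real embedding model:

```python
from collections import deque
import math

class ShortTermMemory:
    """Keeps only the last `capacity` turns, mimicking a bounded context window."""
    def __init__(self, capacity: int = 4):
        self.turns = deque(maxlen=capacity)

    def add(self, turn: str):
        self.turns.append(turn)

    def context(self) -> str:
        return "\n".join(self.turns)

class LongTermMemory:
    """Stores (embedding, text) pairs and retrieves the best match by cosine similarity."""
    def __init__(self, embed):
        self.embed = embed
        self.items = []

    def add(self, text: str):
        self.items.append((self.embed(text), text))

    def search(self, query: str) -> str:
        q = self.embed(query)

        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.hypot(*a) * math.hypot(*b)
            return dot / norm if norm else 0.0

        return max(self.items, key=lambda item: cos(item[0], q))[1]

# Toy 2-d "embedding": keyword counts stand in for a real embedding model.
def toy_embed(text: str):
    t = text.lower()
    return (t.count("attention"), t.count("pricing"))

ltm = LongTermMemory(toy_embed)
ltm.add("Self-attention relates tokens to each other.")
ltm.add("Pricing tiers were updated last quarter.")
print(ltm.search("how does attention work?"))  # → Self-attention relates tokens to each other.
```

In a production system the buffer becomes the prompt's message history and the store becomes an external vector database, but the retrieval contract is the same.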

Integrating LangChain with external tools

To implement Agentic RAG, LangChain can be integrated with various external tools. This integration introduces intelligent agents that can plan, reason, and learn over time. The system typically includes document agents for question answering and summarization, and a meta-agent to oversee and coordinate their efforts. This hierarchical structure enhances capabilities in tasks requiring strategic planning and nuanced decision-making, elevating the agent’s performance to new heights.
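A minimal sketch of that hierarchy, with hypothetical names and a deliberately crude word-overlap relevance score (a real system would use an LLM and embeddings for both the scoring and the answering):

```python
class DocumentAgent:
    """Answers questions about one document only."""
    def __init__(self, name: str, text: str):
        self.name, self.text = name, text

    def score(self, question: str) -> int:
        """Crude relevance: how many question words appear in this document."""
        words = set(question.lower().split())
        return sum(1 for w in words if w in self.text.lower())

    def answer(self, question: str) -> str:
        return f"[{self.name}] context: {self.text[:60]}..."

class MetaAgent:
    """Oversees the document agents and routes each question to the best match."""
    def __init__(self, agents):
        self.agents = agents

    def ask(self, question: str) -> str:
        best = max(self.agents, key=lambda a: a.score(question))
        return best.answer(question)

meta = MetaAgent([
    DocumentAgent("transformer-paper",
                  "self-attention lets each token attend to every other token"),
    DocumentAgent("pricing-doc",
                  "the enterprise tier includes dedicated support and SSO"),
])
print(meta.ask("how does self-attention work?"))  # routed to transformer-paper
```

The value of the meta-agent is exactly this indirection: document agents stay small and specialized, while coordination logic lives in one place.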

Leveraging CrewAI for Advanced Agentic RAG

Introduction to CrewAI

CrewAI is an open-source framework designed to create and manage teams of intelligent agents. Unlike traditional chatbots, these agents can collaborate and share information to tackle complex tasks together. CrewAI serves as a sophisticated platform that empowers organizations to structure their AI operations effectively, simulating the roles and responsibilities of a software development team.

Implementing multi-agent workflows

CrewAI facilitates multi-agent workflows by allowing users to define tasks, roles, goals, and backstories for agents. This approach enhances productivity, decision-making processes, and product design within organizations. The framework supports various collaboration models, including sequential, hierarchical, and asynchronous workflows. By leveraging CrewAI, teams can streamline operations and maximize efficiency through coordinated efforts.
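The sequential model, for instance, can be pictured as tasks run in order, each receiving the outputs of the tasks it lists as its context. The task names and lambdas below are illustrative, not CrewAI API calls:

```python
def run_sequential(tasks):
    """Run tasks in order, feeding each the outputs of the tasks it names as context."""
    outputs = {}
    for name, fn, context in tasks:
        ctx = [outputs[c] for c in context]
        outputs[name] = fn(ctx)
    return outputs

# Three toy tasks mirroring a router -> retriever -> grader chain.
tasks = [
    ("route",    lambda ctx: "vectorstore",                       []),
    ("retrieve", lambda ctx: f"docs via {ctx[0]}",                ["route"]),
    ("grade",    lambda ctx: "yes" if "docs" in ctx[0] else "no", ["retrieve"]),
]
print(run_sequential(tasks))
```

A hierarchical workflow would replace the fixed ordering with a manager task that decides which task runs next; the context-passing idea stays the same.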

Optimizing agent interactions and decision-making

CrewAI optimizes agent interactions through features like role-playing, focus maintenance, and tool utilization. The platform incorporates guardrails for safety measures and protocols, ensuring reliable and ethical operations. Memory capabilities enable agents to store and recall past interactions, enhancing decision-making processes. By integrating CrewAI with advanced language models like Groq’s Llama3-70B, organizations can further improve content generation and task performance.

Agentic RAG Workflow Tutorial

In this tutorial, we are going to see how agents can be involved in a RAG system to retrieve the most relevant information by calling tools.

I’ll be using SingleStore Notebooks (similar to Google Colab or Jupyter Notebooks, but with added features) to run my code. You can use the same: SingleStore has a free shared tier, so you can sign up and start using the service for free.

Sign up now and get started with your notebook.

Once you have created your SingleStore notebook, add the code below and run it step by step.


Install the required libraries


!pip install crewai==0.28.8 crewai_tools==0.1.6 langchain_community==0.0.29 sentence-transformers langchain-groq --quiet        
from langchain_openai import ChatOpenAI
import os
from crewai_tools import PDFSearchTool
from langchain_community.tools.tavily_search import TavilySearchResults
from crewai_tools import tool
from crewai import Crew, Task, Agent

Mention the Groq API Key

import os

# Set the API key
os.environ['GROQ_API_KEY'] = 'Add Your Groq API Key'        

Mention the LLM being used

llm = ChatOpenAI(
    openai_api_base="https://api.groq.com/openai/v1",
    openai_api_key=os.environ['GROQ_API_KEY'],
    model_name="llama3-8b-8192",
    temperature=0.1,
    max_tokens=1000,
)        

Load the data/custom data you would like to use. I am using the publicly available PDF of ‘Attention Is All You Need’.

import requests

pdf_url = 'https://proceedings.neurips.cc/paper_files/paper/2017/file/3f5ee243547dee91fbd053c1c4a845aa-Paper.pdf'
response = requests.get(pdf_url)

# Save the paper locally so PDFSearchTool can index it
with open('attention_is_all_you_need.pdf', 'wb') as file:
    file.write(response.content)

Create a RAG tool variable to pass our PDF

rag_tool = PDFSearchTool(pdf='attention_is_all_you_need.pdf',
    config=dict(
        llm=dict(
            provider="groq", # or google, openai, anthropic, llama2, ...
            config=dict(
                model="llama3-8b-8192",
                # temperature=0.5,
                # top_p=1,
                # stream=true,
            ),
        ),
        embedder=dict(
            provider="huggingface", # or openai, ollama, ...
            config=dict(
                model="BAAI/bge-small-en-v1.5",
                #task_type="retrieval_document",
                # title="Embeddings",
            ),
        ),
    )
)
rag_tool.run("How did self-attention mechanism evolve in large language models?")        

We will be using Tavily as a web search tool for our agents.

Tavily Search API is a search engine optimized for LLMs and RAG, aimed at efficient, quick, and persistent search results. Let’s set the Tavily API key.

import os

# Set the Tavily API key
os.environ['TAVILY_API_KEY'] = 'Add Your Tavily API Key'        
web_search_tool = TavilySearchResults(max_results=3)
web_search_tool.run("What is self-attention mechanism in large language models?")

Let’s define a tool

@tool
def router_tool(question):
  """Router Function"""
  # Send self-attention questions to the vectorstore; everything else to web search.
  if 'self-attention' in question:
    return 'vectorstore'
  else:
    return 'websearch'

Create agents to work with

Router_Agent = Agent(
  role='Router',
  goal='Route user question to a vectorstore or web search',
  backstory=(
    "You are an expert at routing a user question to a vectorstore or web search."
    "Use the vectorstore for questions on concepts related to Retrieval-Augmented Generation."
    "You do not need to be stringent with the keywords in the question related to these topics."
    "Otherwise, use web search."
  ),
  verbose=True,
  allow_delegation=False,
  llm=llm,
)
Retriever_Agent = Agent(
  role="Retriever",
  goal="Use the information retrieved from the vectorstore to answer the question",
  backstory=(
    "You are an assistant for question-answering tasks."
    "Use the information present in the retrieved context to answer the question."
    "You have to provide a clear, concise answer."
  ),
  verbose=True,
  allow_delegation=False,
  llm=llm,
)
Grader_agent = Agent(
  role='Answer Grader',
  goal='Filter out erroneous retrievals',
  backstory=(
    "You are a grader assessing relevance of a retrieved document to a user question."
    "If the document contains keywords related to the user question, grade it as relevant."
    "It does not need to be a stringent test. You have to make sure that the answer is relevant to the question."
  ),
  verbose=True,
  allow_delegation=False,
  llm=llm,
)
hallucination_grader = Agent(
    role="Hallucination Grader",
    goal="Filter out hallucination",
    backstory=(
        "You are a hallucination grader assessing whether an answer is grounded in / supported by a set of facts."
        "Make sure you meticulously review the answer and check if the response provided is in alignment with the question asked."
    ),
    verbose=True,
    allow_delegation=False,
    llm=llm,
)
answer_grader = Agent(
    role="Answer Grader",
    goal="Filter out hallucination from the answer.",
    backstory=(
        "You are a grader assessing whether an answer is useful to resolve a question."
        "Make sure you meticulously review the answer and check if it makes sense for the question asked."
        "If the answer is relevant, generate a clear and concise response."
        "If the answer generated is not relevant, then perform a web search using 'web_search_tool'."
    ),
    verbose=True,
    allow_delegation=False,
    llm=llm,
)

Define tasks for these agents

router_task = Task(
    description=("Analyse the keywords in the question {question}"
    "Based on the keywords decide whether it is eligible for a vectorstore search or a web search."
    "Return a single word 'vectorstore' if it is eligible for vectorstore search."
    "Return a single word 'websearch' if it is eligible for web search."
    "Do not provide any other preamble or explanation."
    ),
    expected_output=("Give a binary choice 'websearch' or 'vectorstore' based on the question"
    "Do not provide any other preamble or explanation."),
    agent=Router_Agent,
    tools=[router_tool],
)
retriever_task = Task(
    description=("Based on the response from the router task extract information for the question {question} with the help of the respective tool."
    "Use the web_search_tool to retrieve information from the web in case the router task output is 'websearch'."
    "Use the rag_tool to retrieve information from the vectorstore in case the router task output is 'vectorstore'."
    ),
    expected_output=("You should analyse the output of the 'router_task'"
    "If the response is 'websearch' then use the web_search_tool to retrieve information from the web."
    "If the response is 'vectorstore' then use the rag_tool to retrieve information from the vectorstore."
    "Return a clear and concise text as response."),
    agent=Retriever_Agent,
    context=[router_task],
)
grader_task = Task(
    description=("Based on the response from the retriever task for the question {question} evaluate whether the retrieved content is relevant to the question."
    ),
    expected_output=("Binary score 'yes' or 'no' to indicate whether the document is relevant to the question"
    "You must answer 'yes' if the response from the 'retriever_task' is in alignment with the question asked."
    "You must answer 'no' if the response from the 'retriever_task' is not in alignment with the question asked."
    "Do not provide any preamble or explanations except for 'yes' or 'no'."),
    agent=Grader_agent,
    context=[retriever_task],
)
hallucination_task = Task(
    description=("Based on the response from the grader task for the question {question} evaluate whether the answer is grounded in / supported by a set of facts."),
    expected_output=("Binary score 'yes' or 'no' to indicate whether the answer is in sync with the question asked"
    "Respond 'yes' if the answer is useful and contains facts about the question asked."
    "Respond 'no' if the answer is not useful and does not contain facts about the question asked."
    "Do not provide any preamble or explanations except for 'yes' or 'no'."),
    agent=hallucination_grader,
    context=[grader_task],
)

answer_task = Task(
    description=("Based on the response from the hallucination task for the question {question} evaluate whether the answer is useful to resolve the question."
    "If the answer is 'yes' return a clear and concise answer."
    "If the answer is 'no' then perform a 'websearch' and return the response"),
    expected_output=("Return a clear and concise response if the response from 'hallucination_task' is 'yes'."
    "Perform a web search using 'web_search_tool' and return a clear and concise response only if the response from 'hallucination_task' is 'no'."
    "Otherwise respond as 'Sorry! unable to find a valid response'."),
    context=[hallucination_task],
    agent=answer_grader,
)

Define the flow for our usecase

rag_crew = Crew(
    agents=[Router_Agent, Retriever_Agent, Grader_agent, hallucination_grader, answer_grader],
    tasks=[router_task, retriever_task, grader_task, hallucination_task, answer_task],
    verbose=True,
)

Ask the query

inputs = {"question": "How does self-attention mechanism help large language models?"}

Kick off the pipeline

result = rag_crew.kickoff(inputs=inputs)        

You should see the agents work through each task in turn to produce the final output/response.


You can see the final response by running

print(result)        


Access the complete notebook code here.

The journey through Agentic RAG, LangChain, and CrewAI showcases the groundbreaking advancements in AI-powered information processing and generation. These tools have a significant impact on how businesses and researchers tackle complex queries and make data-driven decisions. By combining the strengths of autonomous agents, flexible frameworks, and collaborative AI teams, organizations can now handle intricate RAG-related tasks with greater efficiency and accuracy.


I have used SingleStore Notebooks to run this code. Sign up to SingleStore and get started for free.
