Implement Agentic RAG - The NextGen Intelligent Systems

In the ever-evolving landscape of artificial intelligence, a new paradigm is emerging—one that shifts from passive, query-driven models to proactive, decision-making entities. This transformation is embodied in Agentic AI, an advanced form of AI that autonomously plans, decides, and executes actions, bridging the gap between static machine intelligence and dynamic human cognition.

The Evolution of AI: From Reactive to Agentic

Traditionally, AI systems have functioned as reactive tools, responding to queries and commands with pre-trained knowledge. These models, including retrieval-augmented generation (RAG) approaches, rely heavily on searching vast corpora of data and generating responses based on existing knowledge. While effective, they lack the capability to self-direct or adapt beyond what they have been explicitly trained on.

Agentic AI, however, takes a giant leap forward. Instead of merely retrieving and generating responses, these systems are designed to:

  1. Understand Context: They process complex inputs, discern intent, and formulate multi-step plans.
  2. Make Decisions: Through reinforcement learning and optimization, they evaluate multiple pathways and select the best course of action.
  3. Execute Actions Autonomously: They initiate tasks without direct human intervention, much like an intelligent assistant capable of independent problem-solving.
  4. Iterate and Learn: By leveraging feedback loops, they refine their decision-making process over time.
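
As a rough illustration of the decide, execute, and learn steps listed above, the loop below keeps a running score per action and gradually prefers the one that earns the most reward. Everything in it (ToyAgent, the two action names, the fixed reward values) is hypothetical and chosen only to show the control flow; it is not how any particular agent framework works.

```python
from typing import Callable, Dict


class ToyAgent:
    """Hypothetical agent: keeps an average reward per action and prefers the best one."""

    def __init__(self, actions: Dict[str, Callable[[str], float]]):
        self.actions = actions                        # action name -> function returning a reward
        self.scores = {name: 0.0 for name in actions}
        self.counts = {name: 0 for name in actions}

    def decide(self) -> str:
        # Try every action once, then pick the highest average reward so far
        for name, count in self.counts.items():
            if count == 0:
                return name
        return max(self.scores, key=self.scores.get)

    def act_and_learn(self, task: str) -> str:
        name = self.decide()
        reward = self.actions[name](task)             # execute without human intervention
        self.counts[name] += 1
        # Incremental mean: each piece of feedback refines the estimate for this action
        self.scores[name] += (reward - self.scores[name]) / self.counts[name]
        return name


# Fixed rewards stand in for real feedback (an assumption for the demo)
agent_demo = ToyAgent({
    "answer_directly": lambda task: 0.2,
    "look_up_documents": lambda task: 0.9,
})
history = [agent_demo.act_and_learn("Who founded Tata Steel?") for _ in range(5)]
print(history)
```

After trying each action once, the toy agent settles on the action with the higher average reward. A real system would replace the fixed rewards with feedback from users or from an evaluation step.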

How Agentic AI Works: A Python Implementation

To illustrate Agentic AI in action, let’s examine a Python-based framework that integrates retrieval-augmented generation (RAG) with a decision-making agent. The system employs FAISS for vector-based retrieval, a locally hosted DeepSeek model run through llama-cpp-python (the llama_cpp package) for natural language generation, and a custom agent class that decides when retrieval is necessary and when direct generation is enough.

Step 1: Building a Knowledge Retrieval System

Using FAISS, we create a vector store of pre-embedded knowledge. Documents are encoded via Sentence Transformers, allowing efficient similarity searches.

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer
import json

# Load embedding model
embedding_model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")

# Sample documents
documents = [
    "Jamsetji Tata's vision laid the foundation for India's industrial revolution.",
    "The Tata group has pioneered industries like steel, aviation, and IT.",
    "The Tata Trusts have contributed significantly to education and healthcare.",
]

# Generate embeddings
embeddings = np.array(embedding_model.encode(documents), dtype=np.float32)

# Create FAISS index
dimension = embeddings.shape[1]
index = faiss.IndexFlatL2(dimension)
index.add(embeddings)

# Save index and documents
faiss.write_index(index, "vector_store.index")
with open("doc_map.json", "w") as f:
    json.dump(documents, f)
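
To make the role of the index more concrete: IndexFlatL2 performs an exact, brute-force search and reports squared Euclidean (L2) distances. The pure-Python sketch below mimics that behaviour on toy 2-D vectors. The function name flat_l2_search and the toy data are illustrative assumptions, and the sketch ignores the batching and optimizations that FAISS provides.

```python
from typing import List, Tuple


def flat_l2_search(vectors: List[List[float]], query: List[float], k: int) -> Tuple[List[float], List[int]]:
    """Exhaustively compare the query with every stored vector (squared L2 distance)."""
    scored = []
    for idx, vec in enumerate(vectors):
        dist = sum((a - b) ** 2 for a, b in zip(vec, query))
        scored.append((dist, idx))
    scored.sort()                       # smallest distance first
    top = scored[:k]
    return [d for d, _ in top], [i for _, i in top]


# Toy 2-D "embeddings" standing in for real sentence vectors
toy_vectors = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0]]
toy_distances, toy_indices = flat_l2_search(toy_vectors, [1.0, 0.0], k=2)
print(toy_distances, toy_indices)
```

The cost grows linearly with the number of stored vectors, which is fine for a handful of documents like the three above.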
        

Step 2: Implementing Retrieval-Augmented Generation (RAG)

With FAISS storing our knowledge base, we retrieve the most relevant documents for a given query and augment an LLM’s response.

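def retrieve_relevant_documents(query, k=3):
    """Retrieve top-k relevant documents for a query using FAISS"""
    query_embedding = np.array(embedding_model.encode([query]), dtype=np.float32)
    # Never ask for more neighbours than the index holds; FAISS pads missing results with -1
    k = min(k, index.ntotal)
    distances, indices = index.search(query_embedding, k)

    with open("doc_map.json", "r") as f:
        document_list = json.load(f)

    # Skip any -1 placeholder so it is not mistaken for "the last document"
    return [document_list[i] for i in indices[0] if i != -1]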
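
The retrieval function returns the nearest documents no matter how far they are from the query, so an unrelated question still receives context. One possible refinement, sketched below, is to drop hits beyond a distance cutoff. The helper select_documents, the sample data, and the cutoff of 1.0 are assumptions for illustration; a sensible threshold depends on the embedding model and would need to be tuned.

```python
from typing import List, Sequence


def select_documents(distances: Sequence[float], indices: Sequence[int],
                     documents: List[str], max_distance: float) -> List[str]:
    """Keep only valid hits whose distance is within max_distance."""
    selected = []
    for dist, idx in zip(distances, indices):
        if idx < 0:                  # -1 marks an empty result slot
            continue
        if dist > max_distance:      # too far from the query to be useful
            continue
        selected.append(documents[idx])
    return selected


docs_demo = ["doc about steel", "doc about aviation", "doc about trusts"]
# Hypothetical search output: one close hit, one distant hit, one empty slot
kept = select_documents([0.4, 1.7, 3.4e38], [0, 2, -1], docs_demo, max_distance=1.0)
print(kept)
```

If the filtered list comes back empty, the caller can skip augmentation and let the model answer without context.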
        

Step 3: Integrating a Large Language Model (LLM)

We load a DeepSeek LLM to generate responses based on the retrieved documents.

from llama_cpp import Llama

# Load on-prem model
llm = Llama(model_path="C:/models/deepseek-llm-7b-base.Q8_0.gguf")

def generate_response(query):
    """Generate a response using retrieved context and the LLM"""
    retrieved_docs = retrieve_relevant_documents(query)
    context = "\n".join(retrieved_docs)
    
    prompt = f"""
    You are an AI agent using Retrieval-Augmented Generation (RAG).
    Answer the query using the following retrieved documents:

    {context}

    Query: {query}
    Answer:
    """

    response = llm(prompt, max_tokens=300)
    return response["choices"][0]["text"]
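
Because a model's context window is limited, it can help to assemble the prompt in a separate function that numbers the documents and caps how much context is included. The sketch below is one way to do that. The name build_rag_prompt, the wording of the instruction, and the 2000-character budget are assumptions, and a character count is only a crude proxy for tokens.

```python
def build_rag_prompt(query: str, docs: list, max_context_chars: int = 2000) -> str:
    """Assemble a RAG prompt, numbering the documents and capping the context size."""
    lines = []
    used = 0
    for number, doc in enumerate(docs, start=1):
        entry = f"[{number}] {doc}"
        if used + len(entry) > max_context_chars:
            break                    # stop before the context exceeds the budget
        lines.append(entry)
        used += len(entry)
    context = "\n".join(lines)
    return (
        "Answer the query using only the retrieved documents below.\n\n"
        f"{context}\n\n"
        f"Query: {query}\n"
        "Answer:"
    )


demo_prompt = build_rag_prompt(
    "Who founded Tata Steel?",
    ["Jamsetji Tata's vision laid the foundation for India's industrial revolution."],
)
print(demo_prompt)
```

Keeping prompt construction apart from the model call also makes the prompt easy to inspect and test without loading the model.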
        

Step 4: Creating an Agent for Decision-Making

Rather than retrieving knowledge for every query, we implement an Agent that decides whether to rely on retrieval or generate a response independently.

class Agent:
    """Custom agent to decide whether to retrieve, generate, or refine responses"""

    def __init__(self, llm):
        self.llm = llm

    def decide_action(self, query):
        """Decide if retrieval is necessary or if LLM alone can answer"""
        prompt = f"""
        Determine if the query requires external retrieval.
        Respond with 'retrieve' if knowledge from documents is needed, otherwise 'generate':

        Query: {query}
        Answer:
        """
        response = self.llm(prompt, max_tokens=10)["choices"][0]["text"].strip().lower()
        return response

    def execute(self, query):
        """Execute the best approach based on decision"""
        action = self.decide_action(query)

        if "retrieve" in action:
            return generate_response(query)
        else:
            return self.llm(query, max_tokens=300)["choices"][0]["text"]

# Initialize agent
agent = Agent(llm)

# Example agent decision
query = "Who founded Tata Steel?"
response = agent.execute(query)
print(response)
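
The check "retrieve" in action is a substring match, so an output such as "generate, do not retrieve" would still trigger retrieval. A stricter parser, sketched below, looks only at the first word and falls back to a default when the output is unclear. The names parse_action and fake_llm are hypothetical; fake_llm only imitates the response shape used above so the parsing can be tried without loading a model.

```python
import re


def parse_action(raw: str, default: str = "retrieve") -> str:
    """Map free-form model output to 'retrieve' or 'generate'."""
    words = re.findall(r"[a-z]+", raw.lower())
    first = words[0] if words else ""
    if first in ("retrieve", "generate"):
        return first
    return default                   # unclear output: fall back to the default path


def fake_llm(prompt: str, max_tokens: int = 10) -> dict:
    """Hypothetical stand-in that imitates the response dictionary used above."""
    return {"choices": [{"text": " Generate, do not retrieve."}]}


raw_output = fake_llm("Determine if the query requires external retrieval.")["choices"][0]["text"]
decision = parse_action(raw_output)
print(decision)
print(parse_action("I am not sure"))
```

Defaulting to retrieval is a design choice: an unnecessary lookup costs a little time, while a skipped one may produce an unsupported answer.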
        

The Future of Agentic AI

Agentic AI holds immense potential across industries:

  • Healthcare: AI-driven agents could proactively monitor patient conditions and suggest personalized treatments.
  • Finance: Autonomous trading agents could analyze markets and execute real-time strategies.
  • Legal & Compliance: AI could autonomously research regulations and generate compliance reports.
  • Education: Adaptive learning systems could tailor curricula based on individual student needs.

Conclusion

The transition from static AI models to Agentic AI marks a significant evolution in intelligent systems. With the ability to autonomously retrieve, generate, decide, and execute, these agents promise to revolutionize how AI interacts with and augments human capabilities.

As we step into this new frontier, the challenge lies in balancing autonomy with control, ensuring that these agents remain aligned with human values, objectives, and ethical considerations. The future of AI is not just about intelligence—it’s about agency.


Appendix

Here is the code for downloading the pretrained model that Step 3 loads on-premises.

import os

from huggingface_hub import HfApi, hf_hub_download

# Security note: never hardcode tokens; read them from an environment variable instead.
# If HF_TOKEN is unset, hf_token is None and no explicit token is passed.
hf_token = os.getenv("HF_TOKEN")

api = HfApi()

# List the files available in the repository
files = api.list_repo_files(
    repo_id="TheBloke/deepseek-llm-7B-base-GGUF",
    token=hf_token,
)
for filename in files:
    print(filename)

# Download the quantized model file loaded in Step 3
model_path = hf_hub_download(
    repo_id="TheBloke/deepseek-llm-7B-base-GGUF",
    filename="deepseek-llm-7b-base.Q8_0.gguf",
    token=hf_token,
    local_dir="C:/models",
)
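
Since Step 3 loads the model from a fixed path, it may be worth confirming that the file exists before handing the path to the LLM loader. The helper below is a small sketch of that check; resolve_model_path is a hypothetical name, while the directory and filename are the ones used in this article.

```python
from pathlib import Path


def resolve_model_path(model_dir: str, filename: str) -> Path:
    """Return the full model path, or raise a clear error if the file is missing."""
    path = Path(model_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(f"Model file not found: {path}. Run the download step first.")
    return path


try:
    checked_path = resolve_model_path("C:/models", "deepseek-llm-7b-base.Q8_0.gguf")
    print(f"Model ready at {checked_path}")
except FileNotFoundError as err:
    print(err)
```

Failing early with an explicit message is usually easier to diagnose than an error raised from inside the model loader.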

Lakshminarasimhan S.

You can find the working code here. https://github.com/sln2737/AgenticRAG