Build a Powerful RAG Chatbot with Cohere's Command-R

Full tutorial - https://www.youtube.com/watch?v=HTihFrMzur4

In this tutorial, we're going to build a powerful retrieval-augmented generation (RAG) chatbot using Cohere's new Command-R large language model.

Command-R is a scalable language model optimized for Retrieval Augmented Generation (RAG) and tool use.

The best part of Command-R is that we don't have to use any embedding models or vector databases to build RAG-based applications.

Cohere incorporates RAG into its LLMs, especially within question-answering frameworks. Command-R seamlessly integrates with Cohere's Embed and Rerank models to deliver best-in-class RAG capabilities. Notably, Command-R's outputs include clear citations, mitigating the risk of hallucinations and enabling users to easily access additional context from source materials.

Retrieval Augmented Generation (RAG) is a method for generating text using additional relevant information fetched from an external data source. The core idea is that providing relevant documents or context to the language model can greatly increase the factual accuracy and groundedness of the generated text.

With Command-R, the RAG workflow typically involves three main steps (a minimal code sketch follows the list):

  1. Generating search queries that can retrieve relevant documents for the given input prompt or question.
  2. Fetching those relevant documents from the specified data source using the generated search queries.
  3. Generating the final response augmented with the retrieved documents, often including inline citations to ground the output in the source material.
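Here is a minimal sketch of that three-step flow against Cohere's chat API. It assumes the v5-style Python SDK, where returned search queries expose a text field; fetch_documents is a hypothetical retrieval helper standing in for whatever data source you use:

import cohere

co = cohere.Client("YOUR_API_KEY")  # placeholder key
question = "What is retrieval augmented generation?"

# Step 1: ask the model to generate search queries instead of an answer.
query_response = co.chat(
    model="command-r",
    message=question,
    search_queries_only=True,
)
search_queries = [q.text for q in (query_response.search_queries or [])]

# Step 2: fetch relevant documents with those queries.
# fetch_documents is a hypothetical helper -- plug in your own data source.
documents = fetch_documents(search_queries)  # e.g. [{"title": ..., "snippet": ...}]

# Step 3: generate the final answer, grounded in the retrieved documents.
answer = co.chat(
    model="command-r",
    message=question,
    documents=documents,
)
print(answer.text)
print(answer.citations)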

Setup and Dependencies

The app relies on a few key Python libraries:

  1. Streamlit: A framework for building data-driven web apps rapidly with Python.
  2. Cohere: The official Python client library for interacting with Cohere's AI models and APIs.

To access Cohere's APIs, you need to have a valid API key stored as an environment variable named COHERE_API_KEY. The code checks if this key is present and notifies the user if it's missing.
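For reference, a typical setup might look like the following shell commands (the key value is a placeholder):

pip install streamlit cohere
export COHERE_API_KEY="your-cohere-api-key"  # macOS/Linux; use setx on Windows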

Core Functionality

The heart of the app is the generate_rag_response_with_citations function, which takes a user query and a list of user-uploaded documents as input and returns a response generated by Cohere's AI along with a list of citations pointing back to those documents.

Here's how it works:

  1. The function formats the uploaded documents into the {"title": ..., "snippet": ...} shape the chat API expects.
  2. It calls Cohere's chat endpoint, specifying the command-r model, the user query, and a documents argument, which enables the RAG capability: the model searches the supplied documents for relevant information to ground the response.
  3. Cohere's AI generates a response (response.text) along with a list of citations (response.citations) referencing the documents used (an example citation shape is shown after the list).
  4. The function returns the response text and the list of citations.
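For illustration, an individual citation typically carries the cited text span and the IDs of the documents that support it. The exact shape depends on the SDK version; the values below are made up:

# Illustrative citation entry (dict-style response; values invented):
# {
#     "start": 12,
#     "end": 43,
#     "text": "retrieval augmented generation",
#     "document_ids": ["doc_0"]
# }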

Streamlit UI

The app presents a simple user interface built with Streamlit:

  1. A file uploader for the text documents the answer should be grounded in.
  2. A text area for the user to enter their query.
  3. A button to trigger the request to Cohere's AI.
  4. Once the button is clicked, the app calls generate_rag_response_with_citations with the user's query and the uploaded documents.
  5. The response from the AI is displayed.
  6. If there are any citations, they are displayed along with the document snippets they were drawn from.

Putting it all together, here is the complete app:

import streamlit as st
import cohere
import os

# The Cohere API key is read from the COHERE_API_KEY environment variable
api_key = os.getenv('COHERE_API_KEY')

# Ensure the API key is actually retrieved; otherwise, notify the user and stop.
if api_key is None:
    st.error("COHERE_API_KEY environment variable not found. Please set it.")
    st.stop()  # halt here so the client below is never used uninitialized

# Initialize the Cohere client with the API key
co = cohere.Client(api_key)

def generate_rag_response_with_citations(query, documents):
    """
    Generates a response to the user query using Command-R model with RAG capability
    by referencing a set of user-uploaded documents and includes citations in the response.
    
    Parameters:
    - query (str): The user's query.
    - documents (list): A list of documents provided by the user.
    
    Returns:
    - Tuple[str, list]: The generated response and a list of citations.
    """
    # Format documents for the API
    formatted_documents = [{"title": f"doc_{i}", "snippet": doc} for i, doc in enumerate(documents)]
    
    # Call the Cohere chat endpoint, passing the documents so the model can
    # ground its answer in them (RAG)
    response = co.chat(
        model="command-r",
        message=query,
        documents=formatted_documents
    )

    # Extracting text and citations from the response
    response_text = response.text
    citations = response.citations

    return response_text, citations

# Streamlit UI
st.title('RAG with Citations - Command-R')

uploaded_files = st.file_uploader("Upload documents related to your query (text files):", accept_multiple_files=True, type=['txt'])
user_query = st.text_area("Enter your query:")

if st.button('Get Answer'):
    if not user_query:
        st.write("Please enter a query to proceed.")
    elif not uploaded_files:
        st.write("Please upload at least one document to proceed.")
    else:
        # Read the content of the uploaded files
        documents = [file.getvalue().decode("utf-8") for file in uploaded_files]
        
        response, citations = generate_rag_response_with_citations(user_query, documents)
        st.write("Answer:")
        st.write(response)
        
        if citations:
            st.write("Citations:")
            for citation in citations:
                cited_text = citation['text']
                document_ids = citation['document_ids']
                # Assuming document IDs are in the format "doc_x", extract and display the cited document snippets
                for doc_id in document_ids:
                    index = int(doc_id.split('_')[-1])
                    st.write(f"- {cited_text} (from document: {documents[index]})")        