Developing an AI bot powered by RAG and Oracle Database
What is RAG?
An excellent introduction to Retrieval-Augmented Generation (RAG) can be found here.
We can build the UI of the Knowledge Assistant using LangChain and OCI Generative AI, storing the embeddings in Oracle 23ai Vector DB. With these building blocks, customers can easily assemble the chatbot.
To test these functionalities, you can visit our GitHub repository for the Python RAG chatbot. Follow the instructions in the README file to install the appropriate versions of the required software libraries.
Code and functionalities may change as a result of customer feedback.
Building a Full-Stack RAG Chatbot with OCI Generative AI and Oracle Vector Database (Python Powerhouse)
In the realm of chatbots, where responsiveness and knowledge are paramount, Retrieval-Augmented Generation (RAG) offers a compelling solution. This approach combines the power of large language models (LLMs) with the precision of database retrieval, making chatbots more informative and up-to-date. This blog delves into crafting a full-stack RAG chatbot using Oracle Cloud Infrastructure (OCI) Generative AI and Oracle Vector Database, all orchestrated by the versatile Python language.
Why OCI and Oracle 23ai Vector DB?
Building Blocks of the Chatbot:
Putting it All Together:
Now let’s elaborate on each step for the Python RAG chatbot, which is available on GitHub.
Step 1: Accept the user's question
Internally, we pass the user's question to the RAG chain in our Python code:
response = get_answer(rag_chain, question)
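For context, this entry point is typically wired into a simple front end. The following is a minimal sketch assuming a Streamlit UI (the repository's actual UI code may differ):

import streamlit as st

# Minimal chat front end: read a question, run it through the RAG
# chain, and render the answer (illustrative only)
question = st.text_input("Ask a question about your documents:")
if question:
    response = get_answer(rag_chain, question)
    st.write(response)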
Step 2: How to build the RAG chain
Building the RAG chain involves multiple steps, as outlined below:
all_pages = load_all_pages(BOOK_LIST)  # BOOK_LIST = [BOOK1, BOOK2, BOOK3, BOOK4, BOOK5, BOOK6]
document_splits = split_in_chunks(all_pages)  # CHUNK_SIZE = 1000, CHUNK_OVERLAP = 50
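These two helpers are implemented in the repository; a plausible sketch using LangChain's PDF loader and recursive splitter (the real implementation may differ in its details) looks like this:

from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

CHUNK_SIZE = 1000
CHUNK_OVERLAP = 50

def load_all_pages(book_list):
    # Load every page of every PDF in the list as a LangChain Document
    pages = []
    for book in book_list:
        pages.extend(PyPDFLoader(book).load())
    return pages

def split_in_chunks(all_pages):
    # Split pages into overlapping chunks sized for embedding
    splitter = RecursiveCharacterTextSplitter(
        chunk_size=CHUNK_SIZE, chunk_overlap=CHUNK_OVERLAP
    )
    return splitter.split_documents(all_pages)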
If the user opts for a local embedding model, they can use a Hugging Face model; otherwise they can use Cohere embeddings.
embedder = create_cached_embedder()
from langchain.storage import LocalFileStore
from langchain.embeddings import CacheBackedEmbeddings
from langchain_community.embeddings import CohereEmbeddings, HuggingFaceEmbeddings

def create_cached_embedder():
    # Initializing Embeddings model...
    # Local file store introduced to cache embeddings and speed up re-indexing
    fs = LocalFileStore("./vector-cache/")
    if EMBED_TYPE == "COHERE":
        # Loading Cohere Embeddings Model...
        embed_model = CohereEmbeddings(
            model=EMBED_COHERE_MODEL_NAME, cohere_api_key=COHERE_API_KEY
        )
    elif EMBED_TYPE == "LOCAL":
        print(f"Loading HF Embeddings Model: {EMBED_HF_MODEL_NAME}")
        model_kwargs = {"device": "cpu"}
        # normalize_embeddings set to True for BAAI models, to use cosine similarity
        encode_kwargs = {"normalize_embeddings": True}
        embed_model = HuggingFaceEmbeddings(
            model_name=EMBED_HF_MODEL_NAME,
            model_kwargs=model_kwargs,
            encode_kwargs=encode_kwargs,
        )
    # wrap the model in a cache backed by the local file store
    cached_embedder = CacheBackedEmbeddings.from_bytes_store(
        embed_model, fs, namespace=embed_model.model_name
    )
    return cached_embedder
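A quick way to see the cache at work (illustrative): the first call computes and stores vectors under ./vector-cache/, and a rerun over the same texts is served from disk without re-embedding.

embedder = create_cached_embedder()
# A second invocation with the same texts hits the local file cache
vectors = embedder.embed_documents(["Oracle 23ai adds AI Vector Search."])
print(len(vectors[0]))  # embedding dimension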
Our chatbot lets customers configure different databases for storing embeddings. The following code demonstrates this feature:
vectorstore = create_vector_store(VECTOR_STORE_NAME, document_splits, embedder)
Say we want to use Oracle DB: we pass the store_type as "ORACLEDB" in the RAG config file.
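For illustration, the config file might define constants like these (names and values here are assumptions based on the code above, not copied from the repository):

# config_rag.py (illustrative)
VECTOR_STORE_NAME = "ORACLEDB"   # one of "ORACLEDB", "FAISS", "CHROME"
EMBED_TYPE = "LOCAL"             # one of "COHERE", "LOCAL"
EMBED_HF_MODEL_NAME = "BAAI/bge-base-en-v1.5"
EMBED_COHERE_MODEL_NAME = "embed-english-v3.0"
COHERE_API_KEY = "XXXXXX"        # keep secrets out of source control
CHUNK_SIZE = 1000
CHUNK_OVERLAP = 50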
import oracledb
from langchain_community.vectorstores import Chroma, FAISS
from langchain_community.vectorstores.oraclevs import OracleVS
from langchain_community.vectorstores.utils import DistanceStrategy

def create_vector_store(store_type, document_splits, embedder):
    global vectorstore
    print(f"Indexing: using {store_type} as Vector Store...")
    if store_type == "ORACLEDB":
        connection = oracledb.connect(user="ADMIN", password="XXXXXX", dsn="XXXXXXX")
        vectorstore = OracleVS.from_documents(
            documents=document_splits,
            embedding=embedder,
            client=connection,
            table_name="oravs",
            distance_strategy=DistanceStrategy.DOT_PRODUCT,
        )
        print(f"Vector Store Table: {vectorstore.table_name}")
    elif store_type == "FAISS":
        # in-memory FAISS index built from the cached embeddings
        vectorstore = FAISS.from_documents(
            documents=document_splits, embedding=embedder
        )
    elif store_type == "CHROME":
        # ChromaDB collection built from the cached embeddings
        vectorstore = Chroma.from_documents(
            documents=document_splits, embedding=embedder
        )
    return vectorstore
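Once the store is built, a quick sanity check is to run a similarity search directly against it (the query and k below are illustrative):

# Retrieve the 4 chunks most similar to a test query
docs = vectorstore.similarity_search("What is a vector index?", k=4)
for doc in docs:
    print(doc.page_content[:120])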
Next, we create the retriever and the LLM; the LLM acts as the decoder and returns the response in plain text. By default, reranking is disabled in our chatbot code.
# added optionally a reranker
retriever = create_retriever(vectorstore)
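create_retriever is provided by the repository; a plausible sketch with an optional Cohere reranker wrapped via LangChain's contextual compression (an assumed implementation, and import paths vary across LangChain versions) is:

from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CohereRerank

def create_retriever(vectorstore, add_reranker=False):
    # Base retriever: top-k similarity search over the vector store
    retriever = vectorstore.as_retriever(search_kwargs={"k": 8})
    if add_reranker:
        # Rerank the candidate chunks with Cohere before returning them
        compressor = CohereRerank(cohere_api_key=COHERE_API_KEY)
        retriever = ContextualCompressionRetriever(
            base_compressor=compressor, base_retriever=retriever
        )
    return retriever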
from langchain_community.llms import OCIGenAI

if llm_type == "OCI":
    llm = OCIGenAI(
        model_id="cohere.command",
        service_endpoint="XXXXXXX",
        compartment_id="mycompartmentId",
        model_kwargs={"max_tokens": 200},
        auth_type="SECURITY_TOKEN",
    )
template = """Answer the question based only on the following context:
{context}
Question: {question}
"""
rag_prompt = ChatPromptTemplate.from_template(template)
Build the entire RAG chain
print("Building rag_chain...")
rag_chain = (
{"context": retriever, "question": RunnablePassthrough()} | rag_prompt | llm
)
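One detail worth noting: the retriever returns a list of Document objects, so a common LCEL refinement is to join them into a single context string and append an output parser so the chain yields plain text. A minimal sketch of that variant (not the repository's exact code):

from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough

def format_docs(docs):
    # Join the retrieved chunks into one context string for the prompt
    return "\n\n".join(doc.page_content for doc in docs)

rag_chain = (
    {"context": retriever | format_docs, "question": RunnablePassthrough()}
    | rag_prompt
    | llm
    | StrOutputParser()
)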
Step 3: Answer the user's question
The user's question is processed through the RAG chain: the retriever fetches the relevant context, and the OCI Generative AI LLM generates the answer.
def get_answer(rag_chain, question):
    response = rag_chain.invoke(question)
    return response
Highlighting a customer use case solved using the chatbot
Benefits of a Full-Stack RAG Chatbot:
Why Oracle 23ai?
The "best" vector database depends heavily on specific use cases, performance requirements, integration needs, and cost constraints. It's crucial to evaluate these factors for your particular application.
That said, let's analyze the three options we have mentioned in the demo:
ChromaDB, Oracle 23ai DB, and FAISS DB
ChromaDB is a popular open-source vector database designed for flexibility and ease of use. It's often preferred for rapid prototyping and smaller-scale projects due to its Python-centric nature.
Oracle Database 23ai integrates native vector search (AI Vector Search) directly into Oracle's converged database platform. It offers robust performance and scalability, especially for large-scale enterprise applications. However, it might have a steeper learning curve and higher costs associated with Oracle licensing.
FAISS (Facebook AI Similarity Search) is an open-source similarity search library from Meta rather than a full database. It offers very fast in-memory indexing and search over dense vectors, but leaves persistence, metadata filtering, and operational management to the application.
Key Factors to Consider
When choosing a vector database, focus on these aspects: scale and performance requirements, integration with your existing stack and data, operational overhead, and cost.
Recommendations
Additional Considerations
Conclusion
Leveraging OCI Generative AI, Oracle Vector Database, and Python empowers you to build a robust RAG chatbot that delivers engaging and informed user experiences across various domains. For enterprise-scale RAG chatbots, Oracle 23ai DB often emerges as a strong contender due to its performance, scalability, and integration with Oracle infrastructure.