Quaestor-AI: An Extensible Framework for Advanced Retrieval-Augmented Generation
Sanjiv Kumar Jha
Enterprise Architect driving digital transformation with Data Science, AI, and Cloud expertise
Introduction
Quaestor AI is an innovative framework designed to address the limitations of current Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems. It offers a flexible, extensible architecture that allows for customization at various levels, from knowledge base management to query processing and evaluation.
For the latest source code and implementation details, please refer to our GitHub repository: https://github.com/sanjivjha/Quaestor-AI
Key Features and System Architecture
1. Dynamic Knowledge Base
Quaestor AI employs a dynamic knowledge base that can be continuously updated and expanded:
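# Import paths vary by LangChain version (newer releases use langchain_community).
from langchain.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

class SelfRAGSystem:
    def ingest_pdf(self, pdf_path: str) -> int:
        # Load the PDF, then split it into overlapping chunks for retrieval
        loader = PyPDFLoader(pdf_path)
        documents = loader.load()
        text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
        texts = text_splitter.split_documents(documents)
        # Store the chunks and report how many were added
        added_docs = self.knowledge_base.add_documents(texts)
        return len(added_docs)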
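To make the `chunk_size` and `chunk_overlap` parameters concrete, the sketch below splits a string into fixed-size windows that overlap. It is a simplified stand-in, not how `RecursiveCharacterTextSplitter` works internally (the real splitter prefers to break on separators such as paragraphs and sentences), and `split_with_overlap` is a hypothetical helper name.

```python
from typing import List


def split_with_overlap(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


sample = "x" * 2500
print([len(c) for c in split_with_overlap(sample)])  # [1000, 1000, 900]
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.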
2. Multi-Stage Query Processing
Queries pass through several processing stages. The evaluation stage, shown here, asks the LLM to score an answer for relevance, completeness, and accuracy:
from langchain.prompts import PromptTemplate

class AnswerEvaluator:
    def evaluate(self, query: str, answer: str) -> AnswerEvaluation:
        # Ask the LLM to grade the answer on three criteria
        eval_prompt = PromptTemplate.from_template(
            "Evaluate the following answer for relevance, completeness, and accuracy:\n\n"
            "Query: {query}\nAnswer: {answer}\n\n"
            "Provide scores (0-1) and explanations for each criterion."
        )
        eval_result = self.llm(eval_prompt.format(query=query, answer=answer))
        # Parse eval_result and return AnswerEvaluation object
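The snippet above leaves the parsing step as a comment. One way it might look is sketched below, assuming the model replies with lines such as `Relevance: 0.9`. The reply format and the `parse_evaluation` helper are assumptions, not the repository's code. The `AnswerEvaluation` field names follow those used later in this article.

```python
import re
from dataclasses import dataclass


@dataclass
class AnswerEvaluation:
    relevance_score: float
    completeness_score: float
    accuracy_score: float


def parse_evaluation(raw: str) -> AnswerEvaluation:
    """Extract one 0-1 score per criterion from free-form LLM output."""
    scores = {}
    for criterion in ("relevance", "completeness", "accuracy"):
        # Assumed reply shape: "<criterion>: <number>" or "<criterion> = <number>"
        match = re.search(rf"{criterion}\s*[:=]\s*([01](?:\.\d+)?)", raw, re.IGNORECASE)
        # A missing criterion falls back to 0.0 so a malformed reply reads as a poor answer
        value = float(match.group(1)) if match else 0.0
        scores[f"{criterion}_score"] = min(max(value, 0.0), 1.0)
    return AnswerEvaluation(**scores)


reply = "Relevance: 0.9 (on topic)\nCompleteness: 0.5 (misses detail)\nAccuracy: 0.75"
print(parse_evaluation(reply))
```

Free-form replies are brittle to parse. Asking the model for JSON and validating it would be a sturdier design, at the cost of a stricter prompt.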
3. Iterative Query Refinement
To improve performance on complex queries, the system can refine queries based on initial results:
class QueryEnhancer:
    def enhance_query(self, original_query: str, context: str, previous_answer: str, evaluation: AnswerEvaluation) -> str:
        enhance_prompt = PromptTemplate.from_template(
            "Given the original query, context, previous answer, and evaluation, suggest an improved query:\n\n"
            "Original Query: {original_query}\nContext: {context}\n"
            "Previous Answer: {previous_answer}\nEvaluation: {evaluation}\n\n"
            "Improved Query:"
        )
        return self.llm(enhance_prompt.format(
            original_query=original_query,
            context=context,
            previous_answer=previous_answer,
            evaluation=evaluation
        ))
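The enhancer is only one step of the refinement loop. A minimal sketch of the surrounding control flow follows, with plain callables standing in for the LLM-backed components. `refine_until_good`, its parameters, and the 0.8 threshold are all hypothetical.

```python
from typing import Callable, List, Tuple


def refine_until_good(
    query: str,
    answer_fn: Callable[[str], str],
    score_fn: Callable[[str, str], float],
    enhance_fn: Callable[[str, str, float], str],
    threshold: float = 0.8,
    max_iterations: int = 3,
) -> Tuple[str, List[dict]]:
    """Answer, score, and rewrite the query until the score clears the threshold."""
    history: List[dict] = []
    current_query = query
    best_answer, best_score = "", -1.0
    for iteration in range(1, max_iterations + 1):
        answer = answer_fn(current_query)
        score = score_fn(query, answer)  # always judged against the original query
        history.append({"iteration": iteration, "query": current_query, "score": score})
        if score > best_score:
            best_answer, best_score = answer, score
        if score >= threshold:
            break
        current_query = enhance_fn(current_query, answer, score)
    return best_answer, history


# Stub components: the first query yields a weak answer, the rewritten one a strong answer.
canned = {"q": "weak answer", "q (more specific)": "detailed answer"}
answer, history = refine_until_good(
    "q",
    answer_fn=lambda q: canned.get(q, ""),
    score_fn=lambda q, a: 0.9 if a.startswith("detailed") else 0.3,
    enhance_fn=lambda q, a, s: q + " (more specific)",
)
print(answer, len(history))  # detailed answer 2
```

Capping the number of iterations bounds LLM cost, and tracking the best answer seen so far guards against a rewrite that makes things worse.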
4. Transparent Processing
The system offers a debug mode that provides insights into the decision-making process:
def process_query(self, query, message_placeholder):
    self.log_action("Query Received", f"Query: {query}")
    try:
        response, iterations = self.rag_system.query(query)
        full_response = f"**Answer:** {response}\n\n"
        if st.session_state.debug_mode:
            full_response += "**Process Details:**\n"
            for iteration in iterations:
                full_response += f"Iteration {iteration.get('iteration', 'N/A')}:\n"
                full_response += f"- Strategy: {iteration.get('strategy', 'N/A')}\n"
                full_response += f"- Explanation: {iteration.get('explanation', 'N/A')}\n"
                # ... more debug information ...
        return full_response
    except Exception as e:
        error_message = f"Error processing query: {str(e)}"
        self.log_action("Query Processing Failed", error_message)
        return error_message
Extensibility and Customisation
Quaestor AI is designed as a flexible framework that can be extended and customized to meet specific needs. Here's how you can leverage its extensibility:
1. Federated Knowledge Structure
Instead of centralising all information, Quaestor AI supports a federated approach:
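from typing import Callable, List, Optional
from langchain.vectorstores import FAISS  # import path varies by LangChain version

class FederatedKnowledgeBase:
    def __init__(self, embeddings):
        # The embeddings model is passed in so that self.embeddings exists before FAISS is built
        self.embeddings = embeddings
        self.local_store = FAISS.from_texts(["Initial empty knowledge base"], self.embeddings)
        self.external_sources = {}

    def add_external_source(self, name: str, source: Callable):
        self.external_sources[name] = source

    def query(self, query: str, sources: Optional[List[str]] = None):
        # Default to the local store only; None avoids a shared mutable default list
        sources = sources or ["local"]
        results = []
        if "local" in sources:
            results.extend(self.local_store.similarity_search(query))
        for source in sources:
            if source in self.external_sources:
                results.extend(self.external_sources[source](query))
        return results

# Usage (embeddings is any LangChain embeddings object)
knowledge_base = FederatedKnowledgeBase(embeddings)
knowledge_base.add_external_source("enterprise_db", query_enterprise_database)
knowledge_base.add_external_source("pubmed", query_pubmed_api)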
This structure allows easy integration with enterprise knowledge bases or public databases without copying all data locally.
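The federated pattern can be tried without a vector store or live APIs. The sketch below swaps similarity search for a keyword match and tags each hit with its source. `InMemoryFederation`, the sample documents, and the `standards` source are illustrative stand-ins, not part of the framework.

```python
from typing import Callable, Dict, List, Optional


class InMemoryFederation:
    """Toy federated lookup: each external source is a callable returning strings."""

    def __init__(self, local_docs: List[str]):
        self.local_docs = local_docs
        self.external_sources: Dict[str, Callable[[str], List[str]]] = {}

    def add_external_source(self, name: str, source: Callable[[str], List[str]]) -> None:
        self.external_sources[name] = source

    def query(self, query: str, sources: Optional[List[str]] = None) -> List[dict]:
        sources = sources or ["local"]
        results: List[dict] = []
        if "local" in sources:
            # Keyword match stands in for a vector similarity search
            results += [
                {"source": "local", "text": doc}
                for doc in self.local_docs
                if query.lower() in doc.lower()
            ]
        for name in sources:
            if name in self.external_sources:
                # External data is fetched on demand and never copied into local_docs
                results += [{"source": name, "text": t} for t in self.external_sources[name](query)]
        return results


kb = InMemoryFederation(["Pump maintenance manual", "Valve inspection checklist"])
kb.add_external_source("standards", lambda q: [f"Standard clause mentioning {q}"])
for hit in kb.query("pump", sources=["local", "standards"]):
    print(hit["source"], "->", hit["text"])
```

Keeping the source name on every hit lets later stages cite where each passage came from or weight sources differently.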
2. Custom Tool Integration
You can extend the system's capabilities by adding custom tools:
class SelfRAGSystem:
    def add_tool(self, tool: Tool):
        self.tools.append(tool)
        if self.agent_executor:
            self.agent_executor.tools.append(tool)

# Example: Adding a custom PubMed search tool
class PubMedSearchTool:
    def search_pubmed(self, query: str) -> str:
        # Implement PubMed search logic here
        pass

    def get_tool(self) -> Tool:
        return Tool(
            name="PubMed Search",
            func=self.search_pubmed,
            description="Search PubMed for medical research papers"
        )

rag_system = SelfRAGSystem()
pubmed_tool = PubMedSearchTool()
rag_system.add_tool(pubmed_tool.get_tool())
3. Pluggable Evaluation and Classification
The evaluation and classification mechanisms can be customised:
class SelfRAGSystem:
    def set_answer_evaluator(self, evaluator: BaseEvaluator):
        self.answer_evaluator = evaluator

    def set_query_classifier(self, classifier: BaseClassifier):
        self.query_classifier = classifier

# Custom Evaluation Example
class SentimentBasedEvaluator(BaseEvaluator):
    def evaluate(self, query: str, answer: str) -> AnswerEvaluation:
        sentiment = analyze_sentiment(answer)
        return AnswerEvaluation(
            relevance_score=sentiment.relevance,
            completeness_score=sentiment.completeness,
            accuracy_score=sentiment.accuracy
        )

# Custom Classification Example
class DomainSpecificClassifier(BaseClassifier):
    def classify(self, query: str) -> str:
        if "medical" in query.lower():
            return "medical_rag"
        elif "legal" in query.lower():
            return "legal_rag"
        else:
            return "general_rag"

rag_system.set_answer_evaluator(SentimentBasedEvaluator())
rag_system.set_query_classifier(DomainSpecificClassifier())
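The examples above subclass `BaseClassifier` without showing it. A plausible shape for such an interface is sketched here with Python's `abc` module, together with a table-driven classifier. The actual base class in the repository may differ, and `KeywordClassifier` is a hypothetical name. An evaluator interface could follow the same pattern.

```python
from abc import ABC, abstractmethod
from typing import Dict


class BaseClassifier(ABC):
    """Assumed interface: map a query to the name of a RAG strategy."""

    @abstractmethod
    def classify(self, query: str) -> str:
        ...


class KeywordClassifier(BaseClassifier):
    """Route by the first configured keyword found in the query, else a default."""

    def __init__(self, routes: Dict[str, str], default: str = "general_rag"):
        self.routes = routes
        self.default = default

    def classify(self, query: str) -> str:
        lowered = query.lower()
        for keyword, strategy in self.routes.items():
            if keyword in lowered:
                return strategy
        return self.default


classifier = KeywordClassifier({"medical": "medical_rag", "legal": "legal_rag"})
print(classifier.classify("Summarise the legal precedent"))  # legal_rag
```

Because the system depends only on the `classify` method, a keyword table like this can later be swapped for an LLM-based or trained classifier without touching the calling code.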
4. Dynamic Query Enhancement
When the local knowledge base is insufficient, the system can dynamically query external sources:
class QueryProcessor:
    def process_query(self, query: str) -> str:
        local_answer = self.query_local_knowledge_base(query)
        if self.answer_evaluator.is_satisfactory(local_answer):
            return local_answer
        enhanced_query = self.query_enhancer.enhance(query, local_answer)
        external_answer = self.query_external_sources(enhanced_query)
        return self.combine_answers(local_answer, external_answer)

    def query_external_sources(self, query: str) -> str:
        for tool in self.external_tools:
            if tool.is_relevant(query):
                return tool.execute(query)
        return ""
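The processor above expects each external tool to expose `is_relevant` and `execute`. The following sketch shows one way to satisfy that contract and to fall back from a local answer to the first relevant tool. `ExternalTool`, `KeywordTool`, and `answer_with_fallback` are illustrative names, and joining the two answers with a newline is an assumption about what `combine_answers` might do.

```python
from typing import Callable, List, Protocol


class ExternalTool(Protocol):
    """Structural contract assumed by the processor's tool loop."""

    def is_relevant(self, query: str) -> bool: ...
    def execute(self, query: str) -> str: ...


class KeywordTool:
    """Tool that considers itself relevant when any keyword appears in the query."""

    def __init__(self, keywords: List[str], handler: Callable[[str], str]):
        self.keywords = [k.lower() for k in keywords]
        self.handler = handler

    def is_relevant(self, query: str) -> bool:
        return any(k in query.lower() for k in self.keywords)

    def execute(self, query: str) -> str:
        return self.handler(query)


def answer_with_fallback(
    query: str,
    local_answer: str,
    is_satisfactory: Callable[[str], bool],
    tools: List[ExternalTool],
) -> str:
    """Return the local answer if good enough, else append the first relevant tool's result."""
    if is_satisfactory(local_answer):
        return local_answer
    for tool in tools:
        if tool.is_relevant(query):
            return f"{local_answer}\n{tool.execute(query)}".strip()
    return local_answer  # no tool applies, so keep whatever was found locally


papers = KeywordTool(["trial", "study"], lambda q: f"[external] results for: {q}")
print(answer_with_fallback("latest trial data", "", lambda a: len(a) > 20, [papers]))
```

Returning on the first relevant tool keeps latency predictable. Querying every relevant tool and merging the results is a reasonable alternative when coverage matters more than speed.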
Practical Implementation
Here's how you might use these features in practice:
# Initialize the system
rag_system = SelfRAGSystem()
# Add custom knowledge sources
rag_system.add_external_source("enterprise_db", EnterpriseDBConnector())
rag_system.add_external_source("pubmed", PubMedAPIConnector())
# Add custom tools
rag_system.add_tool(CustomCalculatorTool().get_tool())
rag_system.add_tool(DomainSpecificSearchTool().get_tool())
# Set custom evaluation and classification
rag_system.set_answer_evaluator(IndustrySpecificEvaluator())
rag_system.set_query_classifier(MultiLabelClassifier())
# Use the system
query = "What are the latest treatments for type 2 diabetes?"
answer, process_details = rag_system.query(query)
print(f"Answer: {answer}")
print("Process Details:")
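# Each step uses the same keys as the iteration records in the debug output shown earlier
for step in process_details:
    print(f"- {step.get('strategy', 'N/A')}: {step.get('explanation', 'N/A')}")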
Conclusion
Quaestor AI offers a comprehensive solution to the limitations of current LLM and RAG systems. Its flexible architecture allows for customisation at every level, from knowledge base management to query processing and evaluation. By providing a federated knowledge structure, an extensible tool ecosystem, and pluggable components, it enables the creation of specialized, adaptive AI research assistants tailored to specific domains and use cases.
Whether you're integrating with enterprise systems, adding domain-specific tools, or implementing custom evaluation criteria, Quaestor AI provides the flexibility to build a system that meets your requirements while addressing common challenges in information retrieval and knowledge synthesis.
For developers and researchers interested in contributing to or extending Quaestor AI, we encourage you to explore our GitHub repository at https://github.com/sanjivjha/Quaestor-AI, where you'll find detailed documentation, contribution guidelines, and the latest updates to the framework.