Agentic RAGs: consolidated querying of SQL & Document repositories
Debmalya Biswas
AI/Analytics @ Wipro | x- Nokia, SAP, Oracle | 50+ patents | PhD - INRIA
1. Introduction
Given the way technology has evolved, we have become accustomed to thinking of certain solution constructs in terms of specific tools and platforms. For example, retrieval-augmented generation (RAG) is pervasive in the Gen AI world today. Given its emergence in the Gen AI era, we mostly think of RAG in terms of unstructured data / document processing pipelines only. In reality, however, the RAG solution construct can equally well be applied to retrieving structured data from databases / data warehouses using SQL, and then augmenting a prompt with the retrieved structured data.
This observation was triggered by recent Gen AI use-cases and their solutioning discussions, where we increasingly see the need to query SQL and document repositories in parallel, consolidating the respective query pipelines.
For example, let us consider a marketing data landscape where customer (sales agent) profile data is maintained as unstructured documents, while transactional sales data resides in a structured SQL data warehouse. Given this, we will need to get insights from both the profile (unstructured) and transactional (structured) data repositories to answer the below query:
Provide a detailed summary of the Top 3 sales agents for Product X in year 2023.
Needless to say, resolving the above query requires a sequential decomposition of the user query: first, the structured sales data needs to be queried to identify the Top 3 agents for Product X in 2023; the profile documents of those agents can then be retrieved to generate the detailed summary.
In the remainder of this article, we outline a solution flow for the above scenario, leveraging an Agentic AI framework for the query decomposition part, followed by Snowflake Cortex AI as the unified platform to implement RAGs over structured and unstructured data repositories.
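To make the decomposition step concrete, the sketch below splits the example question into two ordered sub-queries, each tagged with the repository it should run against. The function name, task structure, and tags are hypothetical illustrations, not part of any Snowflake or agent-framework API.

```python
# Illustrative sketch: decomposing the example query into sequential sub-tasks.
# The "target" tags ("sql" / "documents") and the plan structure are
# hypothetical, chosen only to illustrate the decomposition idea.

def decompose_query(user_query: str) -> list[dict]:
    """Split a hybrid analytics question into ordered sub-queries,
    each tagged with the type of repository it should run against."""
    return [
        {
            "step": 1,
            "target": "sql",        # structured / transactional data
            "sub_query": "Find the Top 3 sales agents by sales of Product X in 2023",
        },
        {
            "step": 2,
            "target": "documents",  # unstructured / profile data
            "sub_query": "Retrieve and summarize the profiles of the agents found in step 1",
        },
    ]

plan = decompose_query(
    "Provide a detailed summary of the Top 3 sales agents for Product X in year 2023."
)
for task in plan:
    print(task["step"], task["target"], "->", task["sub_query"])
```

Note that step 2 depends on the output of step 1, which is why the decomposition must be sequential rather than a simple parallel fan-out.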
2. RAGs using Snowflake Cortex
Following the theme of this article, Snowflake is a good example of a platform known for its data processing prowess in the structured data world. However, it has made significant strides in the Gen AI world recently, positioning itself as a unified platform of choice for both Data and AI / Gen AI pipelines (refer to my previous article on the topic: link).
Cortex AI (Fig. 1) is Snowflake’s Gen AI / large language model (LLM) platform with Snowflake ML catering to the more traditional AI (data science / predictive analytics) capabilities.
We start by showing how easy it is to build RAG pipelines on documents / unstructured data leveraging Cortex AI. We then extend the same to a consolidated architecture implementing RAGs over both structured and unstructured data repositories.
Together with fine-tuning and reinforcement learning from human feedback (RLHF), RAG forms one of the primary mechanisms of ‘adapting’ a pre-trained LLM with enterprise data, making it more contextual and reducing hallucinations in the process, as captured in the Gen AI lifecycle stages illustrated in Fig. 2.
Given a user query, a RAG pipeline consists of three phases (Fig. 3): retrieval of the relevant context, augmentation of the prompt with the retrieved context, and generation of the final contextualized response.
Below, we highlight the Snowflake components that map to the key architecture artefacts of the above RAG pipeline.
Snowflake also provides Cortex Search as a managed service implementing the retrieval and augmentation parts of a RAG pipeline.
It can thus be combined with Cortex LLM Functions to create a full-fledged RAG pipeline using data in Snowflake as the underlying context repository, as illustrated in Fig. 4.
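The three phases above can be sketched end-to-end as follows. In a real deployment, the retriever would be a Cortex Search service and the generator the Cortex COMPLETE LLM function; both are stubbed here with plain Python functions (and an in-memory corpus) since the actual calls require a live Snowflake connection. All names and the sample texts are placeholders.

```python
# Minimal sketch of the retrieve -> augment -> generate flow of a RAG pipeline.
# `search_documents` and `call_llm` stand in for Cortex Search and the Cortex
# COMPLETE function respectively; both are local stubs, not real API calls.

def search_documents(query: str, top_k: int = 3) -> list[str]:
    """Stub retriever: in Snowflake this would query a Cortex Search service."""
    corpus = {
        "agent profile": "Profile: Jane Doe, senior sales agent, EMEA region.",
        "product sheet": "Product X: enterprise analytics suite, launched 2021.",
    }
    return [text for _, text in list(corpus.items())[:top_k]]

def call_llm(prompt: str) -> str:
    """Stub generator: in Snowflake this would be SNOWFLAKE.CORTEX.COMPLETE."""
    return f"[LLM answer grounded in {prompt.count('Context:')} context block(s)]"

def rag_answer(user_query: str) -> str:
    # 1. Retrieval: fetch the most relevant document chunks.
    chunks = search_documents(user_query)
    # 2. Augmentation: prepend the retrieved context to the user query.
    context = "\n".join(f"Context: {c}" for c in chunks)
    prompt = f"{context}\n\nQuestion: {user_query}\nAnswer using only the context."
    # 3. Generation: let the LLM produce a contextualized response.
    return call_llm(prompt)

answer = rag_answer("Who are our senior sales agents?")
print(answer)
```

Swapping the two stubs for the managed Cortex services is precisely what turns this sketch into the Fig. 4 architecture, with Snowflake handling chunking, embedding, and retrieval.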
3. Text2SQL: Natural Language Queries over Structured Data
In this section, we extend the natural language querying capabilities to structured data stored in a SQL data repository. This capability is called Text2SQL, also referred to as Conversational BI, and we have explored Snowflake’s capabilities to deliver it in a managed and serverless fashion in a previous article (link).
Cortex Analyst is the Text2SQL equivalent of Cortex Search to query structured data / SQL databases and data warehouses.
Cortex Analyst is actually an end-to-end text-to-answer solution: it returns the generated SQL as well as the final query response. It is very easy to deploy (Fig. 6): it is available as an API, and Snowflake also provides a simple Streamlit application that can be deployed with a few lines of code.
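As an illustration of the API-based deployment, the sketch below builds a request payload in the shape used by Snowflake’s Cortex Analyst REST endpoint (`/api/v2/cortex/analyst/message`). The database, stage, and file names are hypothetical, and the exact field names should be verified against the current Snowflake documentation before use.

```python
# Sketch of a Cortex Analyst request payload. The stage path and question are
# placeholders; the payload shape follows the documented
# /api/v2/cortex/analyst/message endpoint but should be re-checked against
# Snowflake's current API reference.
import json

def build_analyst_request(question: str, semantic_model_stage_path: str) -> dict:
    return {
        "messages": [
            {"role": "user", "content": [{"type": "text", "text": question}]}
        ],
        # Path to the semantic model YAML uploaded to a Snowflake stage.
        "semantic_model_file": semantic_model_stage_path,
    }

payload = build_analyst_request(
    "Who were the Top 3 sales agents for Product X in 2023?",
    "@sales_db.public.models/sales_semantic_model.yaml",
)
print(json.dumps(payload, indent=2))
# The payload would then be POSTed to the account's
# .../api/v2/cortex/analyst/message endpoint with an Authorization header.
```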
Overall, it provides the following three differentiating features with respect to other Text2SQL tools in the market:
We all know that LLMs hallucinate, so the first (and best) thing to do is to validate the system’s understanding of a given query with the user before responding with the final answer, and Cortex Analyst enables precisely this kind of validation as part of its conversational flow.
Secondly, Cortex Analyst addresses the data repository metadata mapping problem with the help of a semantic model.
The semantic model is the bridge mapping the domain or business specific terms used by users to the database schemas.
These additional semantic details, like more descriptive names or synonyms, enable Cortex Analyst to answer natural language queries more reliably.
Finally, Cortex Analyst provides a lot of flexibility with respect to choosing the underlying LLM. By default, Cortex Analyst leverages Snowflake-hosted Cortex LLMs, which have been heavily fine-tuned for text-to-SQL generation tasks and are among the most powerful LLMs available today. It is, however, possible to explicitly opt in to allow Cortex Analyst to use the latest OpenAI GPT models, hosted by Microsoft Azure, alongside the Snowflake-hosted models.
At runtime, Cortex Analyst will select the optimal combination of models to ensure the highest accuracy and performance for each query.
4. RAG Agents consolidating SQL and Document?Queries
In this section, we tie everything together by outlining an Agentic AI framework to build RAG pipelines that work seamlessly over both structured and unstructured data stored in Snowflake. The reference architecture of such a RAG Agent is illustrated in Fig. 7.
We first focus on the query decomposition aspect. Given a user task, the goal of an AI agent is to identify an agent (or compose a group of agents) capable of executing the given task. This is achieved by leveraging an LLM as a reasoning engine capable of decomposing the task (the query in this case) into sub-tasks, with execution of the respective agents orchestrated by an orchestration engine, e.g. LangGraph.
The Supervisor agent is responsible for coordinating the overall execution, routing (via if-then-else logic) to the individual agents. As such, the Router is primarily used to route the (SQL and/or document related) sub-queries to the SQL & Document Query agents.
The SQL & Document Query agents are self-explanatory, and leverage the respective Snowflake Cortex Analyst and Search components detailed earlier to query the underlying SQL and Document repositories. Finally, to complete the RAG pipeline, the retrieved data is added to the original query (prompt), leading to the generation of a contextualized response.
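The Supervisor/Router flow of Fig. 7 can be sketched as follows. Both agents are stubs (a real implementation would call Cortex Analyst and Cortex Search), and the keyword-based router is a deliberately naive placeholder for the LLM-based routing an orchestrator such as LangGraph would provide; all names are illustrative.

```python
# Sketch of the Supervisor/Router pattern: sub-queries are routed via
# if-then-else logic to a SQL agent or a Document agent, both stubbed here.
# The keyword hints are a naive stand-in for LLM-based routing.

def sql_agent(sub_query: str) -> str:
    return f"[rows from Cortex Analyst for: {sub_query}]"

def document_agent(sub_query: str) -> str:
    return f"[chunks from Cortex Search for: {sub_query}]"

def route(sub_query: str) -> str:
    """Naive router: structured-sounding queries go to the SQL agent."""
    structured_hints = ("top", "sum", "count", "revenue", "2023")
    if any(hint in sub_query.lower() for hint in structured_hints):
        return sql_agent(sub_query)
    return document_agent(sub_query)

def supervisor(sub_queries: list[str], user_query: str) -> str:
    # Gather retrieved context from each routed agent...
    context = [route(q) for q in sub_queries]
    # ...then augment the original query (prompt) for final generation.
    return "\n".join(context) + f"\n\nQuestion: {user_query}"

final_prompt = supervisor(
    ["Top 3 sales agents for Product X in 2023",
     "Detailed profiles of those agents"],
    "Provide a detailed summary of the Top 3 sales agents for Product X in 2023.",
)
print(final_prompt)
```

The augmented `final_prompt` combining both agents’ retrievals is what gets passed to the LLM for the contextualized response, completing the consolidated RAG pipeline.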
To conclude, the lines between Data and AI / Gen AI platforms are blurring, and companies like Snowflake are making it ever easier to leverage Gen AI / LLM capabilities on secure and governed data already stored in Snowflake. So it is high time to give Snowflake’s Cortex AI a shot for your strategic Gen AI use-cases.
The full article is now published in AI Advances: https://ai.gopubby.com/agentic-rags-extending-rags-to-sql-databases-1509b25ca3e7