How to Harness Retrieval-Augmented Generation (RAG) to unlock Conversational-style Pricing Intelligence


This article explores how RAG, together with Azure OpenAI and Azure AI Search, can transform the way we query and extract insights from data — especially where traditional methods rely heavily on dashboards and manual queries. With RAG, we turn this on its head by allowing AI-driven conversational interfaces to answer questions directly from structured/textual data.

Imagine you’re a fan of J.R.R. Tolkien’s world, specifically The Rings of Power — and your fascination with this fantasy has led you to explore how legendary metals like Mithril would be priced. That’s how I turned a real-world problem in my work at a manufacturing company into an exciting, hands-on exploration of Retrieval-Augmented Generation (RAG) to gain business insights from structured data. While the actual problem I’m solving revolves around pricing transparency for steel products, we’re diving into this example with Mithril to explain how this cutting-edge technology can work across industries.

For those unfamiliar, Mithril is an incredibly powerful, valuable and rare metal in Tolkien’s lore. As a fan of The Rings of Power and a data enthusiast, I created a dataset inspired by Tolkien’s world to simulate real-world pricing dynamics. But you could easily imagine this dataset being applied to more tangible commodities like steel, oil, or financial data.


Problem Statement

Pricing transparency and quick decision-making are critical to maintaining a competitive edge in today's rapidly evolving data landscape. Stakeholders often need to understand why a product is priced a certain way or how market conditions and customer demand affect pricing. This often involves complex manual data aggregation and ad-hoc analyses, resulting in delays.

But what if we could automate this process using cutting-edge AI to get real-time insights?

To explain this, I built a synthetic dataset around a hypothetical commodity, Mithril, inspired by The Rings of Power. Imagine Mithril being traded across Middle-earth, sourced from mines in Khazad-dûm, and sold to clans such as Men, Elves, and Orcs. The dataset includes CRM data with offers, orders, delivery details, dates, and pricing information, along with macroeconomic indicators such as mine capacity, demand, and supply, creating a perfect playground for testing a RAG-based AI solution.

Expected Benefits

  1. Self-Service capabilities: Stakeholders can ask complex questions without relying on technical teams.
  2. Real-Time Insights: Answers are generated in seconds, allowing for quick decision-making.
  3. Scalability: The system can easily be expanded to cover additional datasets and more complex queries.


The Data: CRM and Customer Data

We structured the data into two datasets: CRM data representing orders of Mithril and Customer data representing buyers. These datasets serve as our foundation for extracting insights, and you’ll see how RAG allows us to answer complex business questions based on this structured data.

  • CRM Data: Includes fields like OrderID, CustomerID, OrderDate, Quantity, PricePerUnitUSD, TotalPriceUSD, and mine details such as location, capacity, demand, and supply indices.

  • Customer Data: Features details like CustomerID, Name, Region, Clan, and TransportationCosts, among others.


What Can You Ask?

This is where it gets exciting — imagine being able to ask real questions directly about this structured data.

Here are some sample questions and answers:

  • “What was the average price of Mithril over the last 6 months?”

  • “How has the demand for Mithril fluctuated across different regions?”

  • “How did geopolitical stability impact Mithril prices in Khazad-dûm?”

The AI platform answers these questions by parsing the data, generating insights that normally would require complex queries or pre-built dashboards. This is the real magic of using a RAG-based system.


What is RAG, and Why Use It?

Retrieval-Augmented Generation (RAG) is a powerful framework that combines the strengths of two key components: information retrieval and natural language generation. Instead of relying solely on a large language model’s (LLM’s) pre-trained knowledge, RAG enhances responses by pulling in real-time, external data sources.


High-level RAG (Retrieval-Augmented Generation) architecture

This approach addresses some key limitations of LLMs:

  • Outdated public knowledge: LLMs are typically trained on public data up to a certain date. For example, GPT-4o’s knowledge cutoff is October 2023.
  • No access to private/internal knowledge: Businesses hold a lot of internal data and domain knowledge that LLMs simply cannot reach.


Meme to convey that simply throwing LLMs at data will not always suffice

In our case, we use Azure AI Search as the external data source, which indexes structured data (like pricing, demand, and customer information) and retrieves the most relevant documents based on user queries. Once the relevant data is retrieved, it is passed to a large language model — such as GPT-4o — which then generates a highly contextual, accurate, and insightful response.
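To build intuition for that retrieval step, here is a toy brute-force version of similarity ranking. Azure AI Search actually uses an approximate HNSW index over 1536-dimensional embeddings; the two-dimensional vectors and records below are made up purely for illustration.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query_vec, docs, k=2):
    """Return the k documents whose vectors are most similar to the query."""
    return sorted(docs,
                  key=lambda d: cosine_similarity(query_vec, d["vector"]),
                  reverse=True)[:k]

docs = [
    {"description": "Mithril order to Moria", "vector": [0.9, 0.1]},
    {"description": "Elf customer record",    "vector": [0.1, 0.9]},
]
# A query embedding pointing "toward" the first record ranks it highest
print(top_k([1.0, 0.0], docs, k=1)[0]["description"])
```

In production the scan over all documents is replaced by the HNSW graph index configured in the next section, which trades a little recall for sub-linear query time.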

RAG is particularly useful in scenarios where:

  • The information being queried is dynamic or too large to be entirely captured in the language model’s training data.
  • Responses need to incorporate real-time or domain-specific data.
  • You want to leverage the strength of both structured/unstructured data retrieval and generative AI in one integrated solution.

By using RAG, we ensure that the pricing intelligence platform doesn’t just rely on generic model knowledge but combines it with real-time, structured data for contextually rich answers.


System Architecture: The Workflow Behind the Curtain

Our system architecture is built around a simple yet powerful workflow:

  1. User Query: A stakeholder asks a question (e.g., “What is the average price of Mithril in the last quarter?”).
  2. Convert Query to Embeddings: The user’s natural language query is transformed into vector embeddings using Azure OpenAI’s text-embedding-ada-002 model.
  3. Retrieve Relevant Documents: Azure AI Search indexes return the most relevant documents based on their vector similarity to the user’s query. We’ve set up two indexes: one for CRM data and another for Customer data.
  4. Generate Answer: The query and retrieved context are passed to GPT-4o, which generates a human-like response grounded in the structured data.
  5. Return Response: The final answer is presented back to the user, often with explanations, trends, and comparisons.

Here’s a simple architecture diagram to explain how these pieces fit together:


Workflow of a RAG-based pricing intelligence application using Azure OpenAI and Azure AI Search
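The five steps above can be condensed into one orchestration function. This is a minimal sketch, not the production code: `embed`, `retrieve`, and `generate` stand in for the Azure OpenAI embedding call, the Azure AI Search vector query, and the GPT-4o chat completion shown in the sections that follow.

```python
def answer_query(user_query, embed, retrieve, generate, top_k=50):
    """Run the RAG workflow: embed the query, retrieve context, generate an answer.

    embed(text) -> list[float]         # e.g. text-embedding-ada-002
    retrieve(vector, k) -> list[dict]  # e.g. Azure AI Search vector query
    generate(query, context) -> str    # e.g. GPT-4o chat completion
    """
    # Step 2: convert the natural-language query into an embedding vector
    query_vector = embed(user_query)
    # Step 3: fetch the most relevant records from the indexes
    documents = retrieve(query_vector, top_k)
    # Combine the retrieved records' descriptions into one context string
    context = "\n".join(doc.get("description", "") for doc in documents)
    # Steps 4-5: generate the grounded answer and return it to the user
    return generate(user_query, context)
```

Because the three dependencies are injected as callables, each stage can be swapped or unit-tested in isolation before wiring up the real Azure services.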

Key Steps in Building the Solution

In this section, we will walk through the critical steps required to build a fully functional RAG-based pricing intelligence platform using Azure OpenAI and Azure AI Search. The journey starts with setting up the core services — OpenAI for generating natural language responses and Azure AI Search for indexing structured data. From there, we’ll break down our data into manageable chunks, upload it, and integrate the pieces to ensure smooth retrieval and generation of responses based on user queries.

Each of these steps is crucial in enabling the platform to handle large datasets, ensure scalability, and deliver real-time, insightful answers to complex pricing-related questions.

  • Create Azure OpenAI and Azure AI Search Services. We started by setting up an Azure OpenAI service and deploying two models: GPT-4o for question answering and text-embedding-ada-002 for generating embeddings. In Azure AI Search, we created two indexes (CRM and Customer), each containing fields for the relevant data. One key addition was the description field: it summarizes each record’s content in natural language, enabling GPT-4o to interpret and answer questions more effectively.

from azure.search.documents.indexes.models import (
    SimpleField, SearchField, SearchFieldDataType, SearchIndex,
    VectorSearch, HnswAlgorithmConfiguration, VectorSearchProfile,
    SemanticConfiguration, SemanticPrioritizedFields, SemanticField, SemanticSearch,
)

def create_customer_index(customer_index_name, search_index_client):
    fields = [
        SimpleField(
            name="id",
            type=SearchFieldDataType.String,
            key=True,
            sortable=True,
            filterable=True,
            facetable=True,
        ),
        SearchField(name="CustomerID", type=SearchFieldDataType.String, sortable=True, filterable=True,
                    facetable=True),
        SearchField(name="Name", type=SearchFieldDataType.String),
        SearchField(name="Region", type=SearchFieldDataType.String, sortable=True, filterable=True,
                    facetable=True),
        SearchField(name="Realm", type=SearchFieldDataType.String, filterable=True),
        SearchField(name="Clan", type=SearchFieldDataType.String, filterable=True),
        SearchField(name="Contact", type=SearchFieldDataType.String),
        SearchField(name="GeopoliticalIndex", type=SearchFieldDataType.Double, filterable=True),
        SearchField(name="EconomicHealthIndex", type=SearchFieldDataType.Double, filterable=True),
        SearchField(name="PreferredSeason", type=SearchFieldDataType.String, filterable=True),
        SearchField(name="TransportationCostUSD", type=SearchFieldDataType.Double, filterable=True),
        SearchField(name="description", type=SearchFieldDataType.String, filterable=True),
        SearchField(name="vector", type=SearchFieldDataType.Collection(SearchFieldDataType.Single), searchable=True,
                    vector_search_dimensions=1536, vector_search_profile_name="myHnswProfile"),
    ]

    vector_search = VectorSearch(
        algorithms=[HnswAlgorithmConfiguration(name="myHnsw")],
        profiles=[VectorSearchProfile(name="myHnswProfile", algorithm_configuration_name="myHnsw")]
    )

    semantic_config = SemanticConfiguration(
        name="customer-semantic-config",
        prioritized_fields=SemanticPrioritizedFields(title_field=SemanticField(field_name="Name"),
                                                     content_fields=[SemanticField(field_name="Region"),
                                                                     SemanticField(field_name="Realm")])
    )

    semantic_search = SemanticSearch(configurations=[semantic_config])

    customer_index = SearchIndex(name=customer_index_name, fields=fields, vector_search=vector_search,
                                 semantic_search=semantic_search)

    result = search_index_client.create_or_update_index(customer_index)
    print(f'Customer Index {result.name} created')
    return result
        
Note: While the creation of Azure OpenAI services, AI Search services, and indexes are critical foundational steps, diving into the detailed process would make this article too long. For the sake of brevity, I’ll focus on the core concepts and implementation steps that directly relate to the Retrieval-Augmented Generation (RAG) framework. Setting up these services is well-documented in Microsoft’s official resources, and I encourage you to explore those if you’re starting from scratch.

  • Chunking and Uploading Data. We used a Python script to break large tables into smaller, manageable chunks. The challenge was keeping the data contextual, so we added a meaningful description field to each chunk, allowing the AI to make sense of the data even when it is stored in smaller pieces. Here’s a snapshot of what a processed chunk looks like (mimicking the structure of the created index):

{
  "id": "35c50d6f-279b-4d16-ae25-04d7ce60902d",
  "fields": {
    "OrderID": "ORD001",
    "CustomerID": "CUST031",
    "OfferID": "OFF068",
    "OrderDate": "2024-01-14 15:36:31.160817",
    "DeliveryDate": "2024-09-11",
    "DeliveryFrom": "Khazad-d?m",
    "DeliveryTo": "Moria",
    "Quantity": 5,
    "PricePerUnitUSD": 50000.0,
    "TotalPriceUSD": 277000.0,
    "Mine": "Grey Mountains Mine",
    "vector": [0.00153, 0.01304, ..., -0.01797, -0.00536]
  },
  "description": "Order ORD001 placed by customer CUST031 was delivered from Khazad-d?m to Moria..."
}
        
# Loop through all the chunk files in the directory
for filename in os.listdir(chunk_directory):
    if filename.endswith('.json'):
        file_path = os.path.join(chunk_directory, filename)
        with open(file_path, 'r', encoding='utf-8') as chunk_file:
            chunk_data = json.load(chunk_file)

            # Unpack the fields and vector for individual upload
            document = chunk_data['fields']  # All key-value pairs from 'fields'
            document['vector'] = chunk_data['vector']  # Add the vector to the document
            document['id'] = chunk_data['id']
            document['description'] = chunk_data['description']

            # Upload each document (normal upload)
            try:
                result = search_client.upload_documents(documents=[document])
                print(f"Upload of {filename} succeeded: {result[0].succeeded}")
            except Exception as e:
                print(f"Failed to upload chunk {filename}: {e}")

  • Query Embeddings and Context Retrieval. Once a user query is received, we convert the natural-language input into a vector embedding and search our Azure AI Search indexes. The relevant documents are fetched and passed as context to GPT-4o.

query_embedding = get_embeddings_vector(user_query, openai_client, embedding_model)
# Choose an appropriate number of nearest neighbors
vector_query = VectorizedQuery(vector=query_embedding, k_nearest_neighbors=50,
                               fields="vector")

  • Generating the Final Response. Using the retrieved context, GPT-4o generates a well-informed response. The system can answer complex, multi-faceted questions like “What was the demand index and price fluctuation for Mithril in Khazad-dûm?”, giving businesses actionable insights without manual intervention.

response = openai_client.chat.completions.create(
    model='gpt-4o',
    messages=[
        {"role": "system", "content": "You are a helpful AI assistant for users, "
         "who want to understand mithril and its pricing, orders, customers, "
         "regions etc. Use the indexes and context given to you to "
         "answer the questions."},
        {"role": "user", "content": user_query},
        {"role": "system",
         "content": f"Here is some relevant information to help answer the query: {combined_context}"}
    ],
    extra_body={  # Azure AI Search index is passed here
        "data_sources": [
            {
                "type": "azure_search",
                "parameters": {
                    # This pulls from the SearchClient using env vars
                    "endpoint": search_crm_client._endpoint,
                    # Use the index from the environment variables
                    "index_name": search_crm_index_name,
                    "authentication": {
                        "type": "api_key",
                        "key": azure_search_service_admin_key
                    }
                }
            }
        ]
    }
)

Conclusion: A Self-Service Platform for Real-Time Insights

This platform transforms the way businesses, especially those handling large datasets, extract insights. Whether it’s steel pricing in manufacturing or analyzing financial trends in banking, this RAG-based solution reduces dependency on technical teams and accelerates decision-making.

In this article, we’ve walked through how Retrieval-Augmented Generation can be applied to structured data, driving powerful insights in real time. The power of this solution lies in its versatility: with the right architecture, it can scale across industries and use cases.


If you’re a data enthusiast or a business leader, think about how RAG can revolutionize your industry. This technology goes beyond dashboards and SQL — it allows for true self-service analytics. Try it out for yourself, or get in touch to learn more!

