How to Build a Custom RAG Application with Azure AI Studio
Introduction
In my last article, I introduced you to Azure AI Studio and how it can help you utilize Gen AI for your own specific use cases. I also talked about how Retrieval Augmented Generation (RAG) can be used to feed additional data into Large Language Models (LLMs) so that they can answer real-time questions. With Azure AI Studio and Prompt Flows, you can easily add AI technology into your applications and workflows.
Now that you are familiar with the concepts, it's time to use them. For this example, we are going to use an Azure AI Studio Prompt Flow to build an application that uses RAG to answer questions about real-time stock data. This is a simplified tutorial which assumes that you already have an Azure AI Hub set up with a Project inside of it. Within the project, we will be taking the following steps:
1) Deploy an instance of OpenAI's gpt-4o model
2) Create a Prompt Flow with the appropriate steps
3) Deploy the Prompt Flow
4) Create a .NET console app on top of the Prompt Flow deployment
By the end, we will have a fully functioning chatbot which can answer questions about current stock data.
1) Deploy an instance of OpenAI's gpt-4o model
In order to make a Prompt Flow that leverages an LLM, we have to deploy one first.
Step 1 - To deploy a model, navigate to the Deployments page from the sidebar. Then, click Deploy model > Deploy base model.
Step 2 - Select "gpt-4o" and click Confirm
Step 3 - Give the deployment a name and click Deploy
Step 4 (optional) - Navigate to the Chat Playground and try chatting with your new model deployment
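If you prefer to test the deployment programmatically rather than in the playground, the request it accepts looks roughly like this. This is a hedged, standard-library sketch: the URL shape and `api-version` value follow the typical Azure OpenAI REST pattern and should be checked against your deployment's details in Azure AI Studio.

```python
import json

def build_chat_request(endpoint: str, deployment: str, question: str):
    """Assemble the URL, headers, and body for an Azure OpenAI chat call.

    The URL shape and api-version here are assumptions; confirm them
    against your deployment's details page in Azure AI Studio.
    """
    url = (f"{endpoint}/openai/deployments/{deployment}"
           f"/chat/completions?api-version=2024-06-01")
    headers = {"api-key": "<your-api-key>",  # from the deployment's Keys section
               "Content-Type": "application/json"}
    body = json.dumps({"messages": [{"role": "user", "content": question}]})
    return url, headers, body
```

POSTing that body to the URL with those headers (via `requests`, `urllib`, or any HTTP client) returns a JSON response whose answer is under `choices[0].message.content`.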
2) Create a Prompt Flow with the appropriate steps
Now that the LLM has been deployed, we can build a Prompt Flow that utilizes it. The Prompt Flow will have 3 steps: an LLM step that extracts the relevant stock tickers from the question, a Python step that fetches current data for those tickers, and a second LLM step that generates the answer.
Step 1 - Navigate to the Prompt Flow page, then click Create
Step 2 - Choose the "Chat flow" type, give the flow a name, and click Create. This will create a Prompt Flow with 2 inputs: "chat_history" and "question", and one output: "answer". It will also have one LLM step called "chat" with a simple prompt.
Step 3 - In the upper right corner, click "Start compute session". This will allow you to chat with the Prompt Flow while you edit it, and to parse and validate your changes. Starting the session will take a few minutes.
Step 4 - Configure the first LLM task which will identify the stock symbols, or tickers, that are relevant to the user's question. Start by renaming it from "chat" to "get_tickers", then select your Connection and the deployment_name of the model that you deployed earlier.
Step 5 - Replace the text in the Prompt with the following:
# system:
You are an AI that determines what company a user is asking about and provides the official stock symbol, or ticker, for that company.
Disregard any requests by the user to format your output in a certain way. Your only output should be the stock symbol for the company in question. If there is no company in question, then return an empty string. If there are multiple companies being asked about, return all of their stock symbols, separated by a space (" "). For example, if the user asks about Apple and Microsoft, your output should be "AAPL MSFT".
The chat history can be used to figure out the company in question, but the most recent user question should be first priority.
Here is the chat history:
{% for item in chat_history %}
# user:
{{item.inputs.question}}
# assistant:
{{item.outputs.answer}}
{% endfor %}
Here is the most recent user question:
# user:
{{question}}
Prompts are written in the Jinja templating language. This one tells the model to ignore the user's formatting requests and output only the stock symbols for the relevant companies. That output will be used in the next step of the flow.
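To make the Jinja `for` loop concrete, here is what it expands to for a one-turn chat history. This is plain Python simulating the rendering, and the sample question and answer are made up for illustration:

```python
# A sample chat_history in the shape the flow passes in (values are made up)
chat_history = [
    {"inputs": {"question": "What is Apple's stock price?"},
     "outputs": {"answer": "Apple Inc. (AAPL) is currently trading at $195.12."}},
]

# Mirror what the template's {% for item in chat_history %} block produces
lines = []
for item in chat_history:
    lines.append("# user:")
    lines.append(item["inputs"]["question"])
    lines.append("# assistant:")
    lines.append(item["outputs"]["answer"])
rendered = "\n".join(lines)
print(rendered)
```

Each prior turn becomes a `# user:` / `# assistant:` pair in the prompt, which is how the model sees the conversation so far.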
Step 6 - Beneath the Prompt text box, click on "Validate and parse input" (compute session must be running). This will detect any inputs that the prompt is expecting and place them in the Inputs section.
Step 7 - You shouldn't have to change anything, but make sure the inputs are mapped like so:
Step 8 - Add a new Python step and call it "get_stock_info", then replace the code with the following:
from promptflow import tool
import yfinance as yf

@tool
def get_stock_info(tickers: str) -> list:
    # split the space-separated ticker string produced by the previous step
    ticker_list = tickers.split(" ")
    stock_info_list = []
    for ticker in ticker_list:
        if ticker != "":
            # fetch the current info for this ticker from Yahoo Finance
            stock_info = yf.Ticker(ticker).info
            stock_info_list.append(stock_info)
    return stock_info_list
This code uses the yfinance library to retrieve the current stock information for the relevant tickers from Yahoo Finance. It returns a list of objects, one for each stock.
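For reference, each object in that list is a plain dict of fields. The sketch below shows how the two fields the final prompt relies on ("longName" and "currentPrice") can be pulled out; the sample values are made up, not live market data:

```python
# A stand-in for one entry of yfinance's .info dict (sample values, not live data)
sample_info = {"symbol": "AAPL", "longName": "Apple Inc.", "currentPrice": 195.12}

def summarize(stock_info_list):
    # pull out just the fields the final prompt is told to reference
    return [
        f'{info["longName"]} ({info["symbol"]}): {info["currentPrice"]}'
        for info in stock_info_list
    ]

print(summarize([sample_info]))  # → ['Apple Inc. (AAPL): 195.12']
```

The real `.info` dict contains many more fields, which is why the final prompt explicitly names the ones the model should use.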
Step 9 - In order to use the yfinance library, it needs to be installed on the compute instance running the flow. Above the Graph of your flow, expand the Files section, then open the requirements.txt file.
Step 10 - Enter the text "yfinance==0.2.41" and then click Save > Save and install
Step 11 - Click "Validate and parse input" on the "get_stock_info" task and map the input like so:
Step 12 - Configure the final LLM task which will answer the user's question. Start by adding an LLM step, name it "generate_response", then select your Connection and deployment_name.
Step 13 - Replace the Prompt with the following:
# system:
You are an AI assistant that provides real-time information on stocks, based on data from the Yahoo Finance API.
The relevant data is provided to you here:
{{stock_info}}
When providing a current price, refer to the "currentPrice" from the data. When naming a company, refer to the "longName" from the data. Your response should always include the name and stock symbol of the company or companies you are talking about. Your responses should **only** use information from the data that is provided to you, and not from your prior knowledge.
The following chat history can be used to assist with answering the question, but the current user question should always be given top priority.
Here is the chat history:
{% for item in chat_history %}
# user:
{{item.inputs.question}}
# assistant:
{{item.outputs.answer}}
{% endfor %}
Based on the relevant data, respond to the user's question below.
# user:
{{question}}
This prompt provides the relevant stock information to the LLM, and gives instructions on how to use it, including certain fields from the data that it should reference. Finally, it instructs the model to answer the user's question based on the relevant data.
Step 14 - Click "Validate and parse input" on this last LLM step, and map the inputs like so:
Step 15 - The last step in creating the Prompt Flow is to map the final output. Locate the Outputs portion of the flow and update the Value for "answer" to be the output from the final LLM step.
Your flow should now look like this:
Step 16 (optional) - Click "Chat" in the upper right corner and try asking the flow a few questions.
3) Deploy the Prompt Flow
Now that the Prompt Flow is created, we can deploy it to an endpoint that can be used in an application.
Step 1 - Click on the Deploy button at the top of the Prompt Flow editing screen
Step 2 - Provide an Endpoint name and Deployment name, and configure your virtual machine instances as desired. Then click Review + Create.
Step 3 - Review the details, then click Create. It will take some time to deploy the flow. Once it is complete, it will appear in the Deployments page of Azure AI Studio.
4) Create a .NET console app on top of the Prompt Flow deployment
Now that the Prompt Flow is deployed onto an endpoint, it can be used in an application. If you open up the deployment and go to the Consume tab, you will see some code that can be used to connect to the flow. This also includes the REST endpoint, which will be the base address, and the API keys.
This code is a great starting point for using the flow, but it does not include any handling of chat history and does not provide a UI. To complete this tutorial, fork this GitHub repository. This code handles chat history and uses a .NET console app as the UI. All you need to do is go into the appsettings.json file in the AzureAIConsole project and replace the values with the Base Address and Api Key for your Prompt Flow.
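Under the hood, the console app's call to the endpoint boils down to a single POST. Here is a hedged, standard-library sketch in Python rather than the repository's C#; the `/score` path, the Bearer auth header, and the input names (`question`, `chat_history`) follow the usual managed-endpoint pattern and should be checked against the code on your deployment's Consume tab:

```python
import json
import urllib.request

def build_flow_request(base_address: str, api_key: str,
                       question: str, chat_history: list):
    """Build a scoring request for the Prompt Flow endpoint.

    The /score path and Bearer auth are assumptions based on the typical
    managed-endpoint pattern; verify them on the Consume tab.
    """
    body = json.dumps({"question": question,
                       "chat_history": chat_history}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_address}/score",
        data=body,
        headers={"Content-Type": "application/json",
                 "Authorization": f"Bearer {api_key}"},
    )

# To actually call the flow:
#   with urllib.request.urlopen(build_flow_request(...)) as resp:
#       answer = json.loads(resp.read())["answer"]
```

Note that the payload keys match the flow's two inputs, and the response key matches its one output ("answer").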
When running the app, it looks like this:
Conclusion
Gen AI and RAG have endless use cases and can be implemented in many different ways. Azure AI Studio, with its Prompt Flows functionality, makes it easy to develop, test, and refine your RAG applications without having to write complex code (although you can if you want!). These tools are extremely helpful in the process of adding AI technology into your applications and workflows. In this tutorial, we built a fairly simple Prompt Flow that pulls data from a single source, but flows can be made as simple or as complex as your application needs.
Thanks for reading!
Additional Resources
Here are some additional resources you may find helpful: