Developing LLM-Powered XApplications: A Low/No-Code Chat Application using Prompt Flow (6/n)
Software development is experiencing a massive boost and rebirth with the integration of generative AI solutions. Most industries are being disrupted, and so are the software applications that were built in the pre-AI era.
Over the previous articles in this series, we have been learning how to develop large-language-model powered software applications. The options are endless! In a nutshell, one can integrate with a foundation model such as Meta's Llama 2 or OpenAI's GPT-4, or fine-tune these models as well as many others that are available commercially or as open source.
However, designing, developing, deploying and productionising AI-powered applications is complicated, complex and time-intensive work. The traditional development cycle often involves multiple steps and tools, which can lead to inefficiencies and delays.
Let me introduce you to the latest offering from Microsoft - Prompt Flow - a tool to develop executable flows that bridge LLMs, prompts and Python tools, visualised through a graphical interface.
Prompt Flow comes with a long list of objectives and ambitious goals. It tries to alleviate the frustrations of choosing models, testing and deploying them, and finding the right prompt. In this article we will delve into Prompt Flow and put it to the test by developing a no-code chat application that talks to OpenAI's GPT-3.5 model.
Let's jump in!
Microsoft's Prompt Flow
Prompt Flow is a unified platform to create, manage, evaluate and deploy AI applications. It facilitates the accelerated development of pipelines consisting of LLMs, Python tools, efficient prompts and data sources that power LLM-based AI applications. It allows developers to create executable flows that bridge LLMs, prompts and Python tools, visualised through a graphical interface.
I usually call these non-traditional, AI-powered software applications XApps - accelerated applications.
The goal of Prompt Flow is to alleviate the hurdles in AI application development with Large Language Models (LLMs). It tackles the complexity and inefficiency inherent in the traditional development cycle by streamlining the process from ideation to deployment, thus significantly accelerating the development timeline. It also promises to simplify the evaluation and optimisation of prompts and flows.
Let's dig in to get started!
Getting Started in Azure Portal
Log in to the Azure Portal and set up your Machine Learning resource. It might take a couple of minutes for the resource to be ready. Once the ML resource is ready, click the link to visit the ML dashboard (or you can go directly to ml.azure.com).
At the bottom of the page, you should see a link to launch the ML studio:
Once you launch the ML studio, you should see a "Prompt flow (preview)" menu in the sidebar:
As you can see, similar to the Model catalog that we explored in the last article, Prompt flow is another recent feature released by Microsoft in preview mode. You can expect improvements, changes and modifications to the feature while it is in preview.
Clicking on the Prompt flow menu in the sidebar leads us to the flow designer. This is where we create, manage, test and deploy a flow. We will work with OpenAI's GPT-3.5-turbo model; however, we can use any model our requirements call for by simply creating a connection (which we will see shortly). One of the objectives of adopting Prompt Flow is being able to swap models as we wish.
Before we start, we need to get a compute (virtual machine) instance ready to run our prompt flow, which is discussed in the next section.
Instantiate a Compute Instance
The ML resource we created a moment ago won't have a virtual machine (compute resource) attached to it. Once we have a flow, we need a runtime to run it, and that runtime needs computing power - hence we must create an instance if one isn't available.
Under the "Manage" side menu, click on the "Compute" menu item, which leads to the Compute dashboard. Here you can pick and choose your VM by clicking on the Create button and choosing the appropriate VM. In my case, I chose the least powerful spec as my intention is to develop and test (Standard_DS11_v2 VM with 2 cores, 14GB RAM, 28GB storage).
Instance creation takes a few minutes - wait until the VM is instantiated.
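If you prefer scripting this step, the same compute instance can also be created programmatically. Below is a minimal sketch using the azure-ai-ml Python SDK; the subscription, resource group, workspace and instance names are placeholders, and the portal route above works just as well.

```python
# Hedged sketch: create the same small compute instance with the azure-ai-ml
# SDK instead of the portal. Replace the placeholder identifiers with your own.
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from azure.ai.ml.entities import ComputeInstance

ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id="<subscription-id>",
    resource_group_name="<resource-group>",
    workspace_name="<ml-workspace>",
)

# Same modest SKU as in the walkthrough: 2 cores, 14GB RAM.
compute = ComputeInstance(name="pf-dev-instance", size="Standard_DS11_v2")
created = ml_client.compute.begin_create_or_update(compute).result()
print(f"Compute instance '{created.name}' is ready")
```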
Creating an LLM Connection
Our example is a simple use case: asking OpenAI's GPT model a question. We need to get the OpenAI connector ready before we work through creating a flow and experimenting.
Let's set up a connection to OpenAI's GPT-3.5 model by clicking on the Connections tab on the Prompt flow dashboard and choosing the OpenAI menu item that appears when you click the "Create" button.
This is where you'd ideally create your external connections to your LLMs and other tools like Serp (SerpApi scrapes Google search results), Qdrant or Weaviate (the latter two are vector databases). We will stick to OpenAI in our case.
Provide your OpenAI API key and organisation ID in the form (make sure you provide the ID of your organisation - the hashed code that looks something like org-blahblahblah..).
Once the connector is ready, we are in business (do make sure you have an API key for OpenAI's models - you can create an account and API keys by visiting platform.openai.com).
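Before wiring the key into the flow, it's worth a quick sanity check that the key and organisation ID actually work. Here is a small sketch using the openai Python package (the pre-1.0 API style); the key and organisation values are placeholders:

```python
# Hedged sketch: verify the OpenAI API key/organisation pair outside Prompt Flow
# using the openai Python package (pre-1.0 style). Values are placeholders.
import openai

openai.api_key = "<your-openai-api-key>"
openai.organization = "<your-org-id>"  # the hashed org-... value

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    temperature=0.7,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Say hello in one short sentence."},
    ],
)
print(response["choices"][0]["message"]["content"])
```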
Let's see how we can create the flow in the following section.
Creating a flow
Now that our connection to OpenAI is tested and ready, we can jump into creating a new flow by going to the "Flows" tab and clicking on the "Create" button. This pops up the following widget screen, where you can see the common use cases for which we could use LLMs:
There are three fundamental types of flow that we can create using Prompt Flow:
We will create a simple chat flow here (you can certainly experiment with the other flows, including adding your own code tools if needed to customise the flow).
Chat flow in action
Let's create a simple chat flow - the objective is to prompt the GPT model with a question and fetch a completion.
Click on the "Chat flow" tile to create the flow - input the folder name (folder hosts all the relevant files) appropriately. Each flow consists on a "flow.dag.yaml" file, some source code files and other folders, all bundled up under this folder.
The flow dashboard opens up with a flow component view:
The graph on the right-hand side depicts the main flow components:
Let's look at these components individually:
Inputs
Expand the `inputs` component in the dashboard - you can see two variables: chat_history and question. The flow maintains the chat_history variable to keep the history of the chat for that session.
The second variable is our prompt - the user is expected to fill in the question when the flow runs. We will shortly see this in action.
In our case, we don't need to do anything else here - except that when this flow runs, we'll need to provide the prompt question.
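For reference, the default prompt template we'll see in the Chat component reads each history entry as item.inputs.question and item.outputs.answer, so a chat_history value presumably looks something like the sketch below (an assumed shape, purely for illustration):

```python
# Assumed shape of the chat_history input, inferred from the default prompt
# template (item.inputs.question / item.outputs.answer). Illustrative only.
chat_history = [
    {
        "inputs": {"question": "What is Bitcoin?"},
        "outputs": {"answer": "Bitcoin is a decentralised digital currency..."},
    },
]
```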
Chat
The Chat component is the one that connects to the LLM (the GPT-3.5 model) - it passes the user's question on to the model and expects an answer (completion). The output from the LLM (the chat completion) is then fed to the outputs component.
As this component connects to the LLM, we need to make sure its connection setting is configured. This is where we use our previously created OpenAI connector.
Choose the connection in the dropdown, as shown below:
Choosing the open-ai-connector (the connection I created earlier with OpenAI's API and organisation keys) reveals further model settings and parameters - I've picked the gpt-3.5-turbo model with the temperature set to 0.7 and the rest left as defaults:
We are not going to look at the advanced functions or function calling here. However, one thing I wish to show you is the prompt. The current prompt is auto-created for us and is shown below:
```
system:
You are a helpful assistant.
{% for item in chat_history %}
user:
{{item.inputs.question}}
assistant:
{{item.outputs.answer}}
{% endfor %}
user:
{{question}}
```
As you can see, we send this prompt along with the "question" gathered from the user as part of the inputs component. You can certainly modify this prompt as you wish.
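To make that concrete, here is a small sketch that renders the same Jinja template locally with the jinja2 package, using the assumed chat_history shape from the Inputs section. Roughly speaking, Prompt Flow performs this rendering for us and turns the system/user/assistant sections into chat messages for the model:

```python
# Hedged sketch: render the auto-generated chat prompt with jinja2 to see the
# text that gets produced before it is split into chat messages for the LLM.
from jinja2 import Template

template = Template(
    "system:\n"
    "You are a helpful assistant.\n"
    "{% for item in chat_history %}"
    "user:\n{{item.inputs.question}}\n"
    "assistant:\n{{item.outputs.answer}}\n"
    "{% endfor %}"
    "user:\n{{question}}"
)

chat_history = [
    {"inputs": {"question": "What is Bitcoin?"},
     "outputs": {"answer": "Bitcoin is a decentralised digital currency."}},
]

print(template.render(chat_history=chat_history,
                      question="How is a BTC transaction verified?"))
```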
Outputs
The output of the LLM is then fed to the final component of the graph - the outputs.
The answer variable in the outputs component gets fed the response of the chat component - which will be the LLM's response.
The flow is all set up and ready to go. Let's test it!
Testing the chat flow
Let's run the whole flow by clicking the "Chat" button in the top right corner, then enter the input just as you would a prompt in ChatGPT. I've asked a simple question about BTC. The question (prompt) hits OpenAI and comes back with a response, as you can see in the image below:
Yay! The chat flow successfully invoked the gpt-3.5 model and got an answer back to us without us writing a single line of code! Awesome!
You can also check the outputs component's Output tab for more details, such as how many tokens were spent, the time taken and so on:
Let's take a minute to appreciate what we've done so far. We managed to create a no-code solution to interact with an LLM using Prompt Flow in just a few minutes. That is terrific, isn't it!
Of course, we tested and tried it in the Azure Portal; the natural next step is to deploy it so it is available for public consumption. Let's do that - let's deploy this flow.
Deploying the flow
We can deploy the flow so that an endpoint gets created, which can then be used for public consumption. Click on the deploy button and follow the instructions (they are pretty straightforward). Give the endpoint a name and a few other details (don't use more than one compute instance for scaling - you aren't using this in production yet):
Once the deployment kicks off, wait a few minutes until you receive a notification saying the endpoint is ready!
You can head over to the Endpoints section in the left-hand menu and select our endpoint. We can then test it by clicking the "Test" tab, as the following image shows:
The API endpoint and the authentication keys required to consume it can be found in the "Consume" tab. Copy the primary or secondary key and invoke the endpoint from a Python program or even Postman!
The Test tab provides sample code in Python, C# and R for testing the endpoint - you can always copy that code and try it yourself.
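For reference, a minimal hand-rolled Python client looks roughly like the sketch below. The scoring URL and key are placeholders taken from the Consume tab, and the request body shape (question plus chat_history) is an assumption based on the flow's inputs - check the generated sample code for the exact format.

```python
# Hedged sketch: call the deployed Prompt Flow endpoint directly. The URL, key
# and payload shape are placeholders/assumptions - verify them in the portal.
import json
import urllib.request

SCORING_URL = "https://<your-endpoint>.<region>.inference.ml.azure.com/score"
API_KEY = "<primary-or-secondary-key>"

payload = {
    "question": "What is Bitcoin?",
    "chat_history": [],  # previous turns; empty for a fresh conversation
}

request = urllib.request.Request(
    SCORING_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_KEY}",
    },
)

with urllib.request.urlopen(request) as response:
    print(json.loads(response.read()))
```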
Before wrapping up, let me give you a bonus: we can use Visual Studio Code for all of this development!
Using VS Code
While I can't delve into it much, as this article is already growing out of proportion, I can certainly mention that we can develop prompt flows using our very own Visual Studio Code. All we need to do is install the Prompt flow extension for VS Code.
We can develop prompt flows locally using VS Code and deploy them from our IDE! It even has a visual editor to show us the graph of our components! How cool is that!!
Wrap up
My weekend is coming to a close and so is this article :)
In this article, we looked at Microsoft's latest offering, Prompt Flow, for developing, testing and deploying LLM-powered applications.
Developing AI-powered applications using Prompt Flow is quite handy! The grunt work of writing everything from scratch is taken away. While there are a few gotchas, the tooling for prompt flow works pretty well.
Prompt Flow has numerous other goals and objectives, including helping you choose the right model and iterate towards an efficient prompt through testing. We can surely delve into these goals in other articles, but for now, ciao!
Product & Data Strategy | GenAI | Enterprise Architecture | Cloud Migration | Databases | Data Streaming | APIs | Lead Solutions Architect/Professional Services Architecture & Leadership at DataStax
11 个月Thanks for sharing. Checkout this example of building even driven streaming generative AI application seamlessly with low/no-code tools like LangStream DataStax Henry Issac Mudumala https://docs.langstream.ai/get-started