Copilot for Microsoft 365 Plugins – The Intelligent Plugin Design Pattern
Mahmoud Hassan
Microsoft MVP | Empower enterprises to thrive with Microsoft Copilot & Modern Workplace AI solutions
Plugins are extensions that augment the capabilities (skills) of Copilot for Microsoft 365, allowing it to interact with line-of-business (LOB) apps and services. They expand what users can do by letting them interact with your LOB APIs through natural language conversation.
When building plugins, about 80% of the time you will rely on the Copilot Orchestrator to map implicit constraints from the user's prompt to slots (plugin parameters). You then use these parameters to make a downstream API call to a LOB app or service and get results that you may post-process and return to Copilot as the plugin response.
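To ground this, here is a minimal sketch of that standard 80% flow. The LOB endpoint, slot names, and response fields below are hypothetical placeholders for illustration, not a real API:

```python
import requests

# Hypothetical LOB endpoint -- a placeholder, not a real service.
LOB_API_URL = "https://contoso-lob.example.com/api/orders"

def handle_plugin_call(customer: str, status: str) -> list[dict]:
    """Standard plugin flow: the Copilot Orchestrator has already mapped the
    user's prompt to slots (customer, status); we just call the downstream API."""
    response = requests.get(
        LOB_API_URL,
        params={"customer": customer, "status": status},
        timeout=10,
    )
    response.raise_for_status()
    orders = response.json()

    # Post-process the raw API result into the shape returned to Copilot,
    # trimming fields so the orchestrator gets only what it needs.
    return [
        {"id": o["id"], "title": o["title"], "status": o["status"]}
        for o in orders
    ]
```

Note that the plugin never sees the raw prompt in this flow; the orchestrator has already done the slot filling before the plugin is invoked.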
However, for the remaining 20% you may need to make your plugin more intelligent: accept the entire user query, reason over it grounded in your LOB domain knowledge, and return an answer to that dynamic query. What can you do in this case?
In this article I will explain how you can use my proposed “The Intelligent Plugin” design pattern to make Copilot for Microsoft 365 plugins more intelligent by giving them access to the intelligence of a large language model (LLM) supported by a Retrieval-Augmented Generation (RAG) workflow. Then I will give some sample scenarios where this pattern is useful.
The Intelligent Plugin Design Pattern
In some complex scenarios you will need your plugin to do more than receive some parameters and call a downstream API. Sometimes you will need it to respond by reasoning over the whole user prompt (user query), and that is what the “Intelligent Plugin” design pattern is all about.
I came up with two options for the pattern, Pro-Code and Low-Code. Let's look at each option's detailed flow.
The Intelligent Plugin Design Pattern: Pro-Code Option
Firstly, let’s explain the detailed flow for “The Intelligent Plugin” design pattern utilizing the Copilot for Microsoft 365 Message Extension Plugins, and how the user and the Copilot Orchestrator will interact with the plugin.
1. User Prompt: It starts with the user typing a prompt to Copilot through Microsoft Copilot Chat.
2. Reasoning & Calling: The orchestrator reasons over the prompt, and if the plugin needs to be invoked, it passes the single parameter that the plugin needs: the extracted user query.
Note: For a good example of how to create a plugin that takes in a user query, have a look at the Document Search plugin sample.
3. Orchestration Layer: In this step, the plugin initializes and configures the plugin orchestration framework. You can choose any of the popular ones (Semantic Kernel, Azure AI Studio Prompt Flow, LangChain, etc.) depending on your preference, language support, or required features. The plugin then passes the user query to the selected orchestration framework to kick off the plugin workflow logic.
4. Retrieval-Augmented Generation (RAG): This is an optional step, but in most scenarios you will use the orchestration framework to retrieve information relevant to the user query from your LOB domain knowledge, using vector or hybrid search on the vector database of your choice (Azure AI Search, Cosmos DB, etc.). You will use the retrieved information to ground the final plugin prompt that you send to the LLM to reason over and answer.
5. LLM Calling: At this step you use the orchestration framework to call the LLM with the complete plugin prompt (plugin system prompt + user query + search results), so the LLM applies its learned knowledge to respond to the user query (LLM inference).
6. Response Post-Processing: If you need to apply post-processing to the LLM response, ask the LLM to respond in a structured format (JSON, for example). You can enforce this through the plugin system prompt, or via the JSON mode parameter if you are using one of the newer models. Steps 4–6 are sketched in code right after this list.
7. Plugin Response: The final step is to use the post-processed response from the previous step to build the plugin's final response, which the plugin sends back to the Copilot Orchestrator. The Copilot Orchestrator then processes it and forwards it to the user.
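To make the Pro-Code flow concrete, below is a minimal sketch of steps 4–6. For brevity it calls the Azure AI Search and Azure OpenAI SDKs directly instead of going through an orchestration framework, and the resource names, index name, field name, and deployment name are all placeholder assumptions:

```python
import json

from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient
from openai import AzureOpenAI

# Placeholder resource details -- substitute your own.
SEARCH_ENDPOINT = "https://<your-search>.search.windows.net"
SEARCH_INDEX = "lob-knowledge"
SEARCH_KEY = "<search-api-key>"
AOAI_ENDPOINT = "https://<your-openai>.openai.azure.com"
AOAI_KEY = "<openai-api-key>"
DEPLOYMENT = "gpt-4"  # your Azure OpenAI deployment name

SYSTEM_PROMPT = (
    "You answer questions strictly from the provided context. "
    "Respond in JSON with the keys 'answer' and 'sources'."
)

def answer_user_query(user_query: str) -> dict:
    # Step 4 (RAG): retrieve LOB knowledge relevant to the user query.
    # Plain keyword search is used here for brevity; a real implementation
    # would typically use vector or hybrid search.
    search_client = SearchClient(
        SEARCH_ENDPOINT, SEARCH_INDEX, AzureKeyCredential(SEARCH_KEY)
    )
    hits = search_client.search(search_text=user_query, top=3)
    context = "\n".join(doc["content"] for doc in hits)  # assumes a 'content' field

    # Step 5 (LLM Calling): plugin system prompt + user query + search results.
    llm = AzureOpenAI(
        azure_endpoint=AOAI_ENDPOINT, api_key=AOAI_KEY, api_version="2024-02-01"
    )
    completion = llm.chat.completions.create(
        model=DEPLOYMENT,
        response_format={"type": "json_object"},  # JSON mode for post-processing
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"{user_query}\n\nContext:\n{context}"},
        ],
    )

    # Step 6 (Post-Processing): parse the structured response.
    return json.loads(completion.choices[0].message.content)
```

In a real plugin, the returned dictionary would feed step 7: building the final response that the plugin sends back to the Copilot Orchestrator.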
While the "The Intelligent Plugin" design pattern end-to-end flow is somewhat complicated, but using this design pattern will enable some scenarios that are not possible with the standard plugin flow.
The Intelligent Plugin Design Pattern: Low-Code Options
Now we will talk about how to achieve the same pattern with the Low-Code option, and the challenges I encountered when assessing the current Low-Code plugin features.
Let me start with the challenges I encountered while designing the Low-Code option of the “Intelligent Plugin” design pattern.
Initially I thought about using the AI Prompt plugin for the end-to-end flow, but currently there is no way to use the “Data used” option to search Dataverse: from within the AI Prompt builder you can only filter on a single value.
The Generative answers node is a great option for implementing RAG and the LLM call, and you can also influence its output by adding custom instructions, as you will see in option 3 below. However, last time I checked, the Generative answers node fails to return a structured response (JSON) for post-processing via the custom instructions feature.
Now let’s explain the detailed flow for utilizing the Copilot for Microsoft 365 Conversational Plugin, and how the user and the Copilot Orchestrator will interact with the plugin.
Low-Code Option 1 – The most flexible and complex option
1. User Prompt: It starts with the user typing a prompt to Copilot through Microsoft Copilot Chat.
2. Reasoning & Calling: The orchestrator reasons over the prompt, and if the plugin needs to be invoked, it passes the single parameter that the plugin supports: the extracted user query. The parameter is placed in the system variable Activity.Text within the conversational plugin's Copilot Studio topic.
3. Orchestration Layer: Here we are going to use the plugin Copilot Studio topic as our plugin orchestrator.
4. Retrieval-Augmented Generation (RAG): This is an optional step, but in most scenarios you will use the plugin topic to retrieve information relevant to the user query from your LOB domain knowledge. Currently Dataverse doesn't support vector search, so the only way to get relevant LOB domain knowledge is to call the LLM through a custom connector (as described in the next step) to extract the user query's main keywords, and then use the Dataverse "Search rows" action to get the relevant records. Alternatively, you can use a different LOB domain knowledge database that supports vector search and connect to it with a custom connector.
5. LLM Calling: At this step you use the plugin topic to call the LLM with the complete plugin prompt (plugin system prompt + user query + search results), so the LLM applies its learned knowledge to respond to the user query (LLM inference). To prepare the Dataverse search results for the final prompt, you can use the “Set a variable” node and a Power Fx formula. To call the Azure OpenAI Service, you will need a custom connector, because there is no out-of-the-box connector available; see the sketch after this list for the REST call such a connector wraps.
6. Response Post-Processing: If you need to apply post-processing to the LLM response, ask the LLM to respond in a structured format (JSON, for example). You can enforce this through the plugin system prompt, or via the JSON mode parameter if you are using one of the newer models.
7. Plugin Response: The final step is to use the post-processed response from the previous step to build the plugin's final response, which the plugin sends back to the Copilot Orchestrator. The Copilot Orchestrator then processes it and forwards it to the user.
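Since there is no out-of-the-box connector, the custom connector in step 5 essentially wraps the Azure OpenAI chat completions REST endpoint. Here is a minimal sketch of that request, written in Python purely for illustration; the resource name, deployment name, and API version are placeholder assumptions:

```python
import requests

# Placeholder values -- substitute your own Azure OpenAI resource details.
ENDPOINT = "https://<your-resource>.openai.azure.com"
DEPLOYMENT = "gpt-4"           # your model deployment name
API_VERSION = "2024-02-01"     # check the currently supported version
API_KEY = "<your-api-key>"

def call_azure_openai(system_prompt: str, user_query: str, search_results: str) -> str:
    """The request a Copilot Studio custom connector would make on your behalf."""
    url = (
        f"{ENDPOINT}/openai/deployments/{DEPLOYMENT}"
        f"/chat/completions?api-version={API_VERSION}"
    )
    payload = {
        "messages": [
            {"role": "system", "content": system_prompt},
            # Ground the user query with the Dataverse search results.
            {"role": "user", "content": f"{user_query}\n\nContext:\n{search_results}"},
        ],
        "temperature": 0.2,
    }
    resp = requests.post(url, json=payload, headers={"api-key": API_KEY}, timeout=30)
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```

Defining this request once in a custom connector lets the Copilot Studio topic call it like any other action.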
As you can see, the flow is not ideal yet, because Dataverse lacks vector search and there is no built-in connector for the Azure OpenAI Service. However, Low-Code Copilot for Microsoft 365 plugins are still in public preview, and I expect more features to be added before the GA release.
Low-Code Option 2 – Simplifying the LLM Calling utilizing the Prompt Action
This option is the same as the previous one except we are using a Prompt Action instead of a custom connector to call the Azure OpenAI Service.
The Prompt Action can now be invoked from the plugin Copilot Studio topic (The plugin orchestrator).
The Prompt Action (AI Builder Prompt) now has all the features you need, from specifying the output as Text or JSON, to selecting the model (GPT-3.5 or GPT-4), to adjusting the inference temperature.
When you create a custom prompt in the prompt builder, the panel on the right includes a Settings section, which lets you set these parameters: the version of the generative AI model and the temperature.
I really like all the new features that are coming to the Prompt Action (AI Builder Prompt) and how they simplify our work when we extend Copilot for Microsoft 365 or create our own custom copilot.
Low-Code Option 3 – The simplest option to use
Similar to the previous two options, we use the conversational plugin's Copilot Studio topic as the plugin orchestrator.
Retrieval-Augmented Generation (RAG) & LLM Calling: Here we use the Generative answers node to implement both the RAG step and the LLM inference with the retrieved information.
The Generative answers node has been updated to support multiple knowledge sources, including the new Copilot Connectors as modern knowledge sources.
We also have the classic data sources, including Azure OpenAI Service on your data, which I didn't test in this context.
Response Post-Processing: Last time I checked, the Generative answers node was not able to produce a structured response (JSON) for post-processing via the custom instructions feature. If you require post-processing, a possible workaround is to pass the Generative answers output to a Prompt Action (AI Builder Prompt) and ask it to convert the reply to JSON, as described above.
As you can see, this is the simplest option to use, but it has some limitations: you can't customize the system prompt, select the inference model, change the output format, or adjust the temperature.
The Intelligent Plugin Design Pattern Use Cases
Currently I can think of the following two use cases where you will need the “Intelligent Plugin” design pattern.
Reasoning over and responding to a dynamic user prompt (query)
In this use case your plugin will respond to a dynamic user prompt grounded in your LOB domain knowledge.
Multi-Model Integration
In this use case your plugin will utilize your custom or fine-tuned model to respond to a dynamic user prompt grounded in your LOB domain knowledge.
Summary
The "Intelligent Plugin" design pattern offers a powerful approach to enhancing Copilot for Microsoft 365 plugins with intelligence, leveraging the capabilities of a Language Model (LLM) supported by a Retrieval-Augmented Generation (RAG) workflow. By adopting this pattern, developers can empower plugins to reason over entire user queries, tapping into domain knowledge and delivering dynamic responses beyond utilizing straightforward downstream API calls and multi-model integration.
The Pro-Code option provides detailed control but demands substantial development effort. The Low-Code options, though simpler, face challenges such as Dataverse's data retrieval limitations and the lack of a built-in connector for the Azure OpenAI Service.
Despite these challenges, both options signify a promising pattern for Copilot plugins design, enabling richer interactions and driving innovation within the Copilot for Microsoft 365 ecosystem.
Sharing Is Caring!