Monitor Azure OpenAI instances with Azure Workbooks
Introduction
Recently, while setting up Azure OpenAI instances, I started analyzing the diagnostics logs they generate more and more closely.
Over time, I organized the queries I had built into a workbook to gain better insight into that dataset.
Setup & Design
Enable Azure diagnostic logs for each Azure OpenAI instance first; without them, the workbook has no data to show.
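One quick way to confirm that diagnostic logs are arriving is a small query in the Log Analytics workspace. This is a minimal sketch: it assumes the logs land in the AzureDiagnostics table (as the queries in this post do), and the column aliases LastRecord and Records are made up for illustration.

```kusto
AzureDiagnostics
// Keep only records emitted by Cognitive Services resources (Azure OpenAI included)
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
// One row per resource and log category: when the latest record arrived and how many there are
| summarize LastRecord = max(TimeGenerated), Records = count() by Resource, Category
| order by LastRecord desc
```

If the query returns no rows for an instance, the diagnostic setting for that instance is the first thing to check.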
The Azure OpenAI Monitoring Workbook is divided into sections, each with dedicated queries scoped to the current subscription and a configurable TimeRange filter.
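As a rough sketch of how a query can pick up that filter: in Azure Workbooks, a time range parameter can be referenced directly in the query text. The example assumes the parameter is named TimeRange, as in the description above; if your copy of the workbook uses a different parameter name, substitute it.

```kusto
AzureDiagnostics
// The workbook replaces {TimeRange} with a time filter before the query runs
| where TimeGenerated {TimeRange}
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
// {TimeRange:grain} resolves to a bin size suited to the selected range
| summarize count() by bin(TimeGenerated, {TimeRange:grain}), Resource
```

This text only runs inside a workbook, because the placeholders in braces are not valid KQL on their own.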
How to find this workbook
This workbook has been approved and merged into the Microsoft Sentinel GitHub repository.
Source Code
The source code for the workbook is available in two places:
A. Microsoft Sentinel GitHub Code Repository
After pull request validation and approval, the source code is now available in the Microsoft Sentinel GitHub repository.
B. GitHub Query Store Repository
During ideation and development, all individual queries were documented in a GitHub query store repository.
See in Action -> Deploy to Azure
You can also deploy the workbook directly from here:
Here is the list of queries:
Open AI Operations
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize count() by OperationName
Open AI Operations
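// Request count per result code, with a readable label
let data = (AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize count() by ResultSignature
| extend Result = case(tostring(ResultSignature) == "404", "Not found", tostring(ResultSignature) == "200", "Success", "Unknown"));
// Durations per result code, restricted to the same resource provider
let data1 = (AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize make_list(DurationMs) by ResultSignature);
data
| join kind=inner data1 on ResultSignature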
Open AI Instances Activity
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize count() by Resource
Open AI Total Duration
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where Category == "RequestResponse"
| summarize sum(DurationMs) by Resource
Open AI Resource Types
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize count() by ResourceType
Request Trend
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize ['Duration in Milliseconds'] = make_list(DurationMs) by Resource
Open AI Activity Timeline
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize count() by bin(TimeGenerated, {Timespan:grain}), Resource
Open AI Deviation of Activities
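// Baseline: average number of requests per 15-minute bin
let baseline = toscalar(AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize count() by bin(TimeGenerated, 15m)
| summarize avg(count_));
AzureDiagnostics
// Same filter as the baseline, so both sides cover the same records
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| summarize count() by bin(TimeGenerated, 15m)
// Relative deviation from the baseline; the parentheses are required
| extend deviation = (count_ - baseline) / baseline
| project-away count_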
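The deviation column is presumably meant as the relative difference from the baseline, that is (count - baseline) / baseline. Because division binds tighter than subtraction in KQL, the parentheses matter. Here is a small self-contained sketch with made-up numbers: the baseline of 10 and the three bins are hypothetical, not taken from real logs.

```kusto
let baseline = 10.0; // hypothetical average requests per 15-minute bin
datatable(TimeGenerated: datetime, count_: long)
[
    datetime(2024-01-01 00:00:00), 10,
    datetime(2024-01-01 00:15:00), 15,
    datetime(2024-01-01 00:30:00), 5
]
// Relative deviation from the baseline
| extend deviation = (count_ - baseline) / baseline
// Without parentheses, baseline / baseline is evaluated first, so this is just count_ - 1
| extend unparenthesized = count_ - baseline / baseline
```

With these numbers the deviation values are 0, 0.5 and -0.5, while the unparenthesized column yields 9, 14 and 4, which says nothing about the distance from the baseline.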
Open AI Request Response
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where Category == "RequestResponse"
| project OperationName, DurationMs, CallerIPAddress, ResourceGroup, ResourceId
Creating a completion for the chat message
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where OperationName == "ChatCompletions_Create"
| project TimeGenerated, CallerIPAddress, ResourceId, DurationMs
New Open AI Deployment
AzureActivity
| where ActivitySubstatusValue == "Created"
| extend Role = parse_json(tostring(parse_json(Authorization).evidence)).role
| extend GivenName = parse_json(Claims).name
| project CategoryValue, CallerIpAddress, Caller, GivenName, Role, ActivityStatusValue
Azure Open AI service failures
AzureDiagnostics
| join AzureActivity on ResourceGroup
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where ActivityStatusValue == "Failure"
| extend name_ = tostring(parse_json(Claims).name)
| extend UPN = Caller
| project ActivitySubstatus, ActivitySubstatusValue, HTTPRequest, Resource, OperationName, name_, UPN, CallerIpAddress
Operations through Request Response
AzureDiagnostics
| where ResourceProvider == "MICROSOFT.COGNITIVESERVICES"
| where Category == "RequestResponse"
| project OperationName, DurationMs, CallerIPAddress, ResourceGroup, Resource
ChatGPT usage from Logic App API
AzureDiagnostics
| join AzureActivity on ResourceGroup
| where Resource contains "GPT3"
| where ResourceProvider == "MICROSOFT.LOGIC"
| where isnotempty(Caller)
| where Caller hassuffix "com"
| extend PlaybookName = resource_workflowName_s
| extend Action = Resource
| distinct Caller, PlaybookName, Action, CallerIpAddress
Azure OpenAI Content Safety Service: Detect analyzing text or images
AzureDiagnostics
| where parse_json(properties_s).apiName == "Content Safety Service"
| where OperationName == "Analyze Image" or OperationName == "Analyze Text"
| distinct CallerIPAddress
User information using Azure OpenAI Studio
SigninLogs
| where AppDisplayName == "Azure OpenAI Studio"
| extend parse_json(LocationDetails).city
| extend parse_json(LocationDetails).countryOrRegion
| extend parse_json(LocationDetails).state
| extend parse_json(tostring(parse_json(LocationDetails).geoCoordinates)).latitude
| extend parse_json(tostring(parse_json(LocationDetails).geoCoordinates)).longitude
| extend parse_json(DeviceDetail).browser
| extend parse_json(DeviceDetail).displayName
| extend parse_json(DeviceDetail).operatingSystem
| extend parse_json(DeviceDetail).trustType
| project UserDisplayName, UserPrincipalName, IPAddress, LocationDetails_city, LocationDetails_state, LocationDetails_countryOrRegion, LocationDetails_geoCoordinates_latitude, LocationDetails_geoCoordinates_longitude, DeviceDetail_browser, DeviceDetail_displayName, DeviceDetail_operatingSystem, DeviceDetail_trustType
Conclusion
This post described the Azure OpenAI monitoring workbook, which is available as part of the Microsoft Sentinel GitHub repository.
Feel free to extend it with your own ideas!