Automate ITSM Workflows and Accelerate IT Resolutions Using AI/ML
Vikas Srivastava
Principal Technical Marketing Engineer @ Zscaler | CCIE, Cybersecurity
At Zscaler Zenith Live, I had the pleasure of talking to our customers about how AI can revolutionize IT service management (ITSM) by detecting issues early and initiating automated root cause analysis, helping reduce Mean Time to Detect (MTTD) and Mean Time to Repair (MTTR). In this article, we'll recap what was covered during my session.
We also trained hundreds of people in the ZDX workshop, and if you're interested in ZDX Certifications, there's more on that later. Read on!
This article is organized into two main parts:
If you are new to ZDX, here’s a quick summary:
Troubleshooting application performance and network issues is time-consuming for Tier 1 to Tier 4 service desk and IT teams. The work is further hampered by legacy monitoring tools that don’t share context and require manual correlation. On top of that, skill gaps lead to unnecessary escalations: even simple tickets that could have been resolved early get pushed to higher tiers.
Zscaler Digital Experience (ZDX) addresses these challenges by providing a multitenant, cloud-based monitoring platform that probes, benchmarks, and measures digital experiences for every user in the organization.
By leveraging the same Zscaler Client Connector that many customers have already deployed, ZDX performs synthetic probing to desired SaaS applications or internet-based services, offering critical insights into the user experience. This allows IT support teams to triage, escalate, and close tickets more efficiently, significantly reducing the burden on IT support.
Automate ITSM Workflows
Switching gears: ZDX has built-in workflows that trigger incident creation when an anomaly is detected. It can do this using webhooks, email, or the ServiceNow plugin, which is available in the ServiceNow AppStore.
Here are some of the options which are available to customers to integrate ZDX into their ITSM systems.
Let’s take a look at each of them in detail:
ZDX ServiceNow Integration:
ZDX provides out-of-the-box integration with ServiceNow. The plugin that enables it is available in the ServiceNow Marketplace. Once configured, it provides the following capabilities:
Here’s a video I recorded if you want to see how the integration is set up: https://www.youtube.com/watch?v=5DMrNB8jCWY
Once the setup is complete, ZDX alerts start flowing into ServiceNow as incidents.
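To make the alert-to-incident flow concrete, here is a minimal Python sketch of how an alert could be mapped onto ServiceNow incident fields. The `ZdxAlert` shape and the severity mapping are assumptions made for illustration; the actual plugin's mapping is not described in this article. The target field names (`short_description`, `description`, `urgency`, `impact`, `category`) are standard ServiceNow incident fields.

```python
from dataclasses import dataclass


@dataclass
class ZdxAlert:
    """Hypothetical alert shape, assumed for this sketch only."""
    alert_id: str
    alert_type: str      # e.g. "Network", "Application"
    severity: str        # e.g. "high", "medium", "low"
    summary: str
    impacted_users: int


# Assumed mapping from alert severity to ServiceNow's 1 (high) .. 3 (low) scale.
SEVERITY_TO_URGENCY = {"high": "1", "medium": "2", "low": "3"}


def alert_to_incident(alert: ZdxAlert) -> dict:
    """Build an incident record body from an alert (no network call is made)."""
    # Unknown severities fall back to the lowest urgency.
    urgency = SEVERITY_TO_URGENCY.get(alert.severity.lower(), "3")
    return {
        "short_description": f"[ZDX] {alert.alert_type} alert: {alert.summary}",
        "description": (
            f"ZDX alert {alert.alert_id} impacting "
            f"{alert.impacted_users} user(s)."
        ),
        "urgency": urgency,
        "impact": urgency,
        "category": "network" if alert.alert_type.lower() == "network" else "software",
    }


incident = alert_to_incident(
    ZdxAlert("a-1", "Network", "High", "High latency to SaaS app", 42)
)
```

The resulting dictionary could then be sent to the incident table by whichever channel you use (plugin, webhook receiver, or a custom script).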
Bringing user-level details into the ServiceNow ticket: beyond letting alerts flow into ServiceNow as incidents, the integration also brings in user-level details.
When a ServiceNow incident is opened for a user, the user's ZDX experience score is populated within the incident. This enables ServiceDesk technicians to provide a white-glove experience by having immediate insight into the current state of the user's experience.
Additionally, ServiceDesk technicians can run an on-demand troubleshooting session. This session captures granular details and allows them to analyze, evaluate, and troubleshoot issues for a specific user, device, or application.
Zscaler Workflow Automation
We've covered the ServiceNow integration, but what if you need more control over which users, departments, or even ServiceNow instances handle specific tickets?
To address these challenges and provide enhanced control, we are launching Zscaler Workflow Automation, currently in beta. This new feature allows for precise configuration and management of ticket routing, ensuring that the right teams handle the right issues efficiently. Let’s dive into it.
Once Workflow Automation is enabled for your tenant, you can define multiple ServiceNow destinations and create routing rules for your alerts. These rules ensure that alerts flow into ServiceNow based on specified criteria.
For example, you can define a rule for a specific alert type, such as Network, so that it triggers a workflow when the alert is classified as high severity.
This workflow will then assign the ticket to the appropriate ServiceNow tenant or user according to your predefined configurations. This ensures that critical issues are routed to the right team or individual for prompt attention and resolution, enhancing the efficiency and responsiveness of your IT support operations.
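The routing idea described above can be sketched in a few lines of Python. This is only an illustration of rule-based routing under assumed names: `RoutingRule`, the severity ranking, and the destination labels are hypothetical and do not reflect how Workflow Automation is configured in the product.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class RoutingRule:
    """Hypothetical rule: route alerts of a type at or above a severity."""
    alert_type: str
    min_severity: str
    destination: str     # label for a ServiceNow instance or team (assumed)


# Assumed ordering of severities, lowest to highest.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3}


def route(alert_type: str, severity: str, rules: List[RoutingRule]) -> Optional[str]:
    """Return the destination of the first matching rule, or None if none match."""
    for rule in rules:
        if rule.alert_type.lower() != alert_type.lower():
            continue
        if SEVERITY_RANK[severity.lower()] >= SEVERITY_RANK[rule.min_severity.lower()]:
            return rule.destination
    return None


rules = [
    RoutingRule("Network", "high", "servicenow-netops"),
    RoutingRule("Application", "medium", "servicenow-apps"),
]
```

With these rules, a high-severity Network alert goes to `servicenow-netops`, while a low-severity Network alert matches nothing and is left unrouted.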
Integrations leveraging ZDX Open APIs
During the session, we also explored how ZDX integrates seamlessly with advanced analytics tools like Splunk, Power BI, and Moogsoft. These integrations offer significant flexibility and enhanced analytics capabilities, allowing you to gain deeper insights into your IT environment.
Please note that these options are not intended to export all data from ZDX into a third-party data analytics solution. Instead, they enable you to export the most relevant data for your specific use case, within the constraints of API limits.
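Because exports have to stay within API limits, a custom integration usually needs to back off when it is throttled. The sketch below shows generic exponential backoff on HTTP 429 in Python. The `fetch` callable is a placeholder you would supply yourself; no specific ZDX endpoint, limit, or response format is assumed here.

```python
import time
from typing import Any, Callable, Tuple


def fetch_with_backoff(
    fetch: Callable[[], Tuple[int, Any]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call fetch() until it stops returning HTTP 429, doubling the wait each time.

    fetch is a caller-supplied function returning (status_code, body).
    Any status other than 429 is returned to the caller unchanged.
    """
    for attempt in range(max_attempts):
        status, body = fetch()
        if status != 429:
            return status, body
        # Wait 1x, 2x, 4x ... base_delay before the next attempt.
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("rate limit still exceeded after retries")
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays.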
Leveraging the ZDX API for Custom Integrations:
We have a Jupyter Notebook in our GitHub repository that walks through the ZDX APIs and includes examples of some of the use cases mentioned above.
Innovations in AI/ML for IT Troubleshooting
We explored the latest innovations in machine learning (ML) and artificial intelligence (AI) that assist with troubleshooting and detecting issues in IT environments. These technologies can accelerate IT resolutions by providing deeper insights and automated solutions.
ZDX has leveraged AI/ML across its feature set from very early on, starting with Automated Root Cause Analysis as our first ML-based feature. Here are some of the ZDX features that leverage AI/ML:
Let’s look at each of these.
Automated Root Cause Analysis
Now let’s take a look at how IT admins can leverage AI/ML to get to the root cause of the issue quickly: ZDX can swiftly identify the root cause of user experience issues with its new AI-powered root cause analysis capability.
The Automated Root Cause Analysis feature is a powerful tool that assists both ServiceDesk and Tier 4 teams in quickly pinpointing the source of issues.
When a user's score falls into the poor category, the "Analyze Score" button triggers a correlation of data points (CPU, Memory, Network, Wi-Fi, DNS, etc.) to determine what caused the degradation in user experience.
It then provides a verdict with detailed insights into the root cause.
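As a rough intuition for what "correlating data points" can mean, the Python sketch below compares a sample of metrics against a baseline and ranks the ones that deviate most. This is a deliberately simple stand-in, not the ML model ZDX uses; the metric names, the 25% tolerance, and the assumption that higher values are worse for every metric are all illustrative.

```python
from typing import Dict, List, Tuple


def rank_suspects(
    sample: Dict[str, float],
    baseline: Dict[str, float],
    tolerance: float = 0.25,
) -> List[Tuple[str, float]]:
    """Rank metrics whose value exceeds the baseline by more than `tolerance`.

    Assumes every metric is "higher is worse" (CPU %, memory %, latency, DNS time).
    Returns (metric_name, relative_deviation) pairs, largest deviation first.
    """
    suspects = []
    for name, value in sample.items():
        base = baseline.get(name)
        if not base:
            continue  # no baseline (or a zero baseline) to compare against
        deviation = (value - base) / base
        if deviation > tolerance:
            suspects.append((name, deviation))
    return sorted(suspects, key=lambda item: item[1], reverse=True)


suspects = rank_suspects(
    sample={"cpu_pct": 95, "memory_pct": 60, "dns_ms": 30, "latency_ms": 48},
    baseline={"cpu_pct": 40, "memory_pct": 55, "dns_ms": 25, "latency_ms": 45},
)
```

In this made-up sample only CPU is far enough from its baseline to be flagged, which is the kind of verdict a technician would then investigate.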
The analysis table provides key details for a specific date and time in the graph:
With ZDX, you can compare application scores to understand why they might vary over time. Score comparisons can reveal why a current score differs significantly from a previous one. This feature utilizes web, device, and Cloud Path metrics to determine differences in scoring.
To start your comparison, select a point within the ZDX Score Over Time graph and choose from the "Compare to" drop-down menu:
In the above example, we demonstrate the differences in network statistics between the compared point and the analyzed point. It provides detailed information to the ServiceDesk admin on what has changed.
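Conceptually, a comparison like this boils down to diffing the metrics captured at two points in time. The Python sketch below shows that idea with hypothetical metric names and values; it is not the scoring logic ZDX applies.

```python
from typing import Dict


def compare_points(analyzed: Dict[str, float], compared: Dict[str, float]) -> Dict[str, float]:
    """Return the change (analyzed minus compared) for each metric that differs.

    Only metrics present at both points are considered.
    """
    changes = {}
    for name in analyzed.keys() & compared.keys():
        delta = analyzed[name] - compared[name]
        if delta != 0:
            changes[name] = delta
    return changes


# Hypothetical network statistics at the analyzed and compared points.
changes = compare_points(
    analyzed={"latency_ms": 120, "packet_loss_pct": 3, "hop_count": 12},
    compared={"latency_ms": 40, "packet_loss_pct": 0, "hop_count": 12},
)
```

Here latency and packet loss changed between the two points while the hop count did not, so only the first two appear in the result.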
Incident Dashboard
So far, we’ve seen how to trigger root cause analysis on a degraded score. But since ZDX already has millions of telemetry points from the end-user experience perspective, we went a step further and developed our Incident Dashboard, which correlates these data points and surfaces deep-rooted issues in your environment that you may not yet be aware of.
The ZDX Incidents Dashboard provides a comprehensive view of IT incidents impacting user device performance, categorized into Wi-Fi, Last Mile ISP, ZIA/ZPA Public Service Edge, Application, and more. It uses AI/ML to detect and analyze incidents, offering real-time monitoring and detailed metrics, such as the number of incidents, impacted users, and their geographic locations. Filters allow you to refine the data by geolocation, type, and time range. Key features include incident analysis over time, visualization of incidents on a map, and detailed incident insights, enhancing IT operations and service quality.
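One simple way to picture this kind of correlation is to group users with a poor experience by what they have in common. The Python sketch below does only that, with hypothetical field names (`wifi_ap`, `isp`, `service_edge`) and an arbitrary threshold of three users. The real dashboard relies on ML and on thresholds that this sketch does not attempt to reproduce.

```python
from collections import Counter
from typing import Dict, List, Tuple


def detect_incidents(
    poor_users: List[Dict[str, str]],
    min_users: int = 3,
) -> List[Tuple[str, str, int]]:
    """Flag shared attributes that at least `min_users` poor-score users have in common.

    Returns (dimension, value, user_count) tuples, most common first.
    """
    counts = Counter()
    for user in poor_users:
        for dimension in ("wifi_ap", "isp", "service_edge"):
            value = user.get(dimension)
            if value:
                counts[(dimension, value)] += 1
    return [
        (dimension, value, n)
        for (dimension, value), n in counts.most_common()
        if n >= min_users
    ]


# Made-up sample: three affected users share an ISP but not an access point.
incidents = detect_incidents([
    {"wifi_ap": "ap-1", "isp": "ISP-A"},
    {"wifi_ap": "ap-2", "isp": "ISP-A"},
    {"wifi_ap": "ap-3", "isp": "ISP-A"},
    {"wifi_ap": "ap-4", "isp": "ISP-B"},
])
```

In this sample the shared ISP stands out as the likely common factor, which mirrors the "Last Mile ISP" incident category mentioned above.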
Key components:
A common question is about the criteria or thresholds that trigger incidents on the dashboard. While much of this is driven by machine learning, here are some high-level details on the thresholds and data monitored over time to trigger an incident.
Self Service
Continuing with the same AI innovations, we next looked at resolving issues at the user level, before they go on to impact the end-user experience. With Self Service, we give the user a gentle nudge (a notification) that a CPU or Wi-Fi issue might be impacting their experience. Self Service can help users identify the root cause of issues related to CPU usage and Wi-Fi access, allowing them to investigate potential solutions without contacting customer support. When enabled for your users, Self Service provides notifications when issues are detected and need attention. Each notification contains a brief diagnosis and a recommendation that might resolve the CPU or Wi-Fi issue.
ZDX Copilot
ZDX Copilot offers versatile capabilities for various IT functions:
ZDX Copilot aids IT employees across various functions in upskilling, automating tasks, gaining digital experience insights, and performing in-depth performance analysis. By leveraging knowledge from over 500 trillion daily metrics across devices, networks, and applications, observed by the world’s largest security cloud, ZDX with Copilot helps your teams significantly improve efficiency and collaboration across IT operations, service desks, and security. Frequently asked questions about ZDX Copilot are covered in "ZDX Copilot: Frozen LLM".
To conclude, let’s look at the financial benefits of implementing ZDX:
A properly deployed ZDX solution offers significant financial advantages. For a company with 45,000 users, we projected annual cost savings of approximately $7.4 million, driven by:
Conclusion:
Incorporating AI and ML into ITSM workflows not only enhances efficiency but also drives substantial cost savings. ZDX’s comprehensive approach addresses common IT challenges, providing a robust solution for modern IT environments.
For more details on our analysis and to see how we reached these savings, check out the guide.
If you’ve made it this far, thank you for reading! Please reach out to your account team if you would like to hear more about the features discussed in this article and how ZDX can enhance your end-user and employee experience.
Thank you!