AIOps - Explainability using pertinent positives
Naga (Arun) Ayachitula
Vice President, AIOps Engineering (Data/Analytics & AI/ML) and Distinguished Engineer
Arun Ayachitula, Rohit Khandekar & Upendra Sharma
Classifier Explainability is a broad AI practice aimed at explaining classification decisions and establishing ‘Trust in AI’. Indeed, Explainability has become one of the important factors in evaluating machine learning models and in gaining the trust of the IT user community for adoption at scale.
What is Classifier Explainability?
Classifier Explainability refers to the ability to provide insights into how a classifier's decision process works. Broadly, there are two types of classifier Explainability:
Global Explainability deals with insights into how a classifier is working at a corpus level. This includes, for example, determining
a. whether the classifier is biased with respect to some classes,
b. whether the classifier is being confused between any pairs of classes,
c. which feature sets are most important for given classes.
Global Explainability uses the confusion matrix and the coefficient matrix, and surfaces pertinent positives and pertinent negatives at the level of the classifier model.
Local Explainability deals with insights into how a classifier is working at an instance level. This includes, for example, determining
a. how the classifier reached its decision for a given input,
b. which features were most important for this decision,
c. which missing features, had they been present, would have changed the decision of the classifier.
Good explanations are important for helping end-users develop trust in the classifier, for developing new insights about the underlying domain, or for improving the classifier itself. One explanation may not suit everyone, however. Indeed, different users may require different types of explanations based on their needs and levels of sophistication.
Explainability with less data
Here we focus on local Explainability for small IT texts. Small texts, such as ticket abstracts, ticket descriptions, and event summaries, are prevalent in the IT domain and provide valuable insights into the overall health of the IT infrastructure. Furthermore, small texts are easier to deal with than larger texts since they carry a smaller context to understand. Below, we briefly describe how we compute explanations while classifying such small IT texts.
To explain why the classifier made one decision rather than another, we compute local explanations as described below.
Computing sparse pertinent positive features
In an industry-scale text classification task, the dimensionality of the full feature space can easily be in the millions. Fortunately, we do not face the curse of dimensionality when dealing with a small text, since we only need to deal with the small number of features present in that text.
To simplify the exposition, let us assume that we are using a linear classifier, e.g., an SVM or a Passive-Aggressive classifier, with unigrams as features, without any TF-IDF transformation and without any class-probability calibration. Thus, the classifier uses a k-by-n coefficient matrix C, where k and n denote the number of classes and features respectively, and, given a binary input feature vector x, it outputs the class j = argmax_i C_i x, where C_i denotes the i-th row of C.
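As a concrete, hedged illustration of this setup, the decision rule of such a linear classifier can be sketched in a few lines of Python (the names C, x and predict_class are illustrative, not taken from our production system):

import numpy as np

# Minimal sketch of the linear decision rule described above (illustrative names).
def predict_class(C: np.ndarray, x: np.ndarray) -> int:
    """Return the index j of the class with the largest decision score C_j . x."""
    scores = C @ x                     # one decision score per class
    return int(np.argmax(scores))

# Tiny example: 3 classes, 5 unigram features; the text contains features 0, 2 and 4.
C = np.array([[ 0.9, -0.1,  0.4, 0.0, -0.2],
              [-0.3,  0.8, -0.5, 0.1,  0.6],
              [ 0.0,  0.0,  0.7, 0.2,  0.3]])
x = np.array([1, 0, 1, 0, 1])
j = predict_class(C, x)                # -> 0 for this C and x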
The problem of computing a pertinent positive explanation can be formulated as computing a binary vector x’ such that:
1. x’ is dominated by x, i.e., it only has a subset of the features of x,
2. x’ is classified into the same class j,
3. x’ is sparse, i.e., it has as few features as possible, and
4. x’ has as large a “distance function gap” as possible, i.e., it minimizes max_i (C_i x’ – C_j x’), where the maximum is taken over all i not equal to j.
Since the maximum of linear functions is a convex function, the above problem can be cast as a convex optimization problem. We impose L1-regularization to get sparsity and L2-regularization to reduce the solution magnitude even further. Such a problem can be solved by using standard techniques from convex optimization, including gradient-descent and shrinkage-thresholding algorithms.
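As a hedged sketch (not our exact production formulation), the relaxed objective can be written down directly: x’ is relaxed to a continuous vector z over the features present in x, and the regularization weights below are placeholder assumptions:

import numpy as np

# Hedged sketch of the relaxed objective described above.  z is a continuous
# relaxation of x' restricted to the support of x (so its dimension is tiny),
# j is the class originally predicted for x, and lam_l1 / lam_l2 are assumed
# regularization weights, not values from the original system.
def pertinent_positive_objective(C_small, z, j, lam_l1=0.1, lam_l2=0.01):
    """max_{i != j}(C_i z - C_j z) plus L1 and L2 penalties; smaller is better."""
    scores = C_small @ z
    gap = np.max(np.delete(scores, j)) - scores[j]   # convex, piecewise-linear in z
    return gap + lam_l1 * np.sum(np.abs(z)) + lam_l2 * np.sum(z ** 2)

# Here C_small = C[:, np.flatnonzero(x)] restricts the coefficient matrix to the
# features actually present in the small text, as noted earlier.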
Iterative Shrinkage/Thresholding Algorithms (ISTA) and their applications to computing Pertinent Positive features in classification
Iterative Shrinkage/Thresholding Algorithms (ISTA) are used to compute sparse solutions to linear inverse problems. A typical example of a linear inverse problem is linear regression.
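For intuition, here is a minimal, generic ISTA sketch for the classic sparse linear-regression (lasso) problem min_w 0.5*||Xw - y||^2 + lam*||w||_1; the step size and the value of lam are illustrative assumptions:

import numpy as np

def soft_threshold(v, t):
    """Element-wise shrinkage operator: sign(v) * max(|v| - t, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista_lasso(X, y, lam=0.1, n_iter=200):
    """Generic ISTA for min_w 0.5*||Xw - y||^2 + lam*||w||_1 (illustrative only)."""
    step = 1.0 / (np.linalg.norm(X, ord=2) ** 2)   # 1/L, L = Lipschitz constant of the gradient
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)                   # gradient of the smooth least-squares term
        w = soft_threshold(w - step * grad, step * lam)
    return w

Each iteration takes a gradient step on the smooth least-squares term and then applies the shrinkage (soft-thresholding) operator, which is what drives coordinates of w to exactly zero and yields a sparse solution.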
Consider a classification problem, e.g., the text classification problem. We use a Passive-Aggressive algorithm for such a text classification problem in the IT ticket management domain. Consider a ticket T that gets classified into a class C (e.g., "disk-handler") by this algorithm. It is often important to show "evidence" of the inner workings of the classifier and "explain" why the ticket T got classified into class C. The pertinent positives in a ticket like T are a small subset of the features of T that are responsible for its classification into class C. Such a set of features provides a good explanation of the inner workings of the classifier.
We formulate the problem of finding pertinent positive features for the text classification problem as a sparse linear inverse problem. We customize and simplify the ISTA algorithm to make it very efficient for this use case. This customization is non-trivial and cannot be easily derived from the general ISTA algorithm: it has to take into account the specific problem formulation that the Passive-Aggressive classifier (PAC) uses internally and exploit its structural properties to implement the iterative thresholding step efficiently.
The specific contributions of this work are:
1. Formulating the problem of computing pertinent positives for the IT ticket classification problem, i.e., identifying the tokens in the ticket description that explain the classifier's behavior
2. Formulating the problem of computing pertinent positives for the linear (Passive-Aggressive) classifier as an L1-regularization problem
3. Using the ISTA algorithm to solve the above-mentioned L1-regularization problem
4. Converting the output of the ISTA algorithm into pertinent positive features (tokens) by applying a magnitude-significance threshold (a sketch of this step follows)
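The thresholding in contribution 4 can be sketched as follows; the function name and the relative threshold are assumptions for illustration, not the exact values used in our system:

import numpy as np

# Hedged sketch of turning the solved relaxation z back into tokens.  z is the
# continuous solution over the features present in the ticket, tokens are the
# corresponding unigrams, and rel_threshold is an assumed magnitude-significance
# threshold, not a value from the original system.
def pertinent_positive_tokens(z, tokens, rel_threshold=0.2):
    """Keep tokens whose solution magnitude is a significant fraction of the maximum."""
    z = np.abs(np.asarray(z, dtype=float))
    if z.max() == 0.0:
        return []
    keep = z >= rel_threshold * z.max()
    return [tok for tok, kept in zip(tokens, keep) if kept]

# e.g. pertinent_positive_tokens([0.9, 0.05, 0.7], ["disk", "guest", "space"])
# returns ["disk", "space"]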
AIOps Visualization – a view
Explainability: Local Explainability is computed by identifying the pertinent positive features of a given ticket using AI/ML natural language processing techniques. The pertinent positives for a sample ticket are highlighted in the ticket description below.
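A hedged sketch of this highlighting step (the function name and the **...** marker style are illustrative):

import re

# Wrap each pertinent positive token found in the ticket text in markers.
def highlight(text, pertinent_positives):
    for tok in pertinent_positives:
        text = re.sub(rf"\b({re.escape(tok)})\b", r"**\1**", text, flags=re.IGNORECASE)
    return text

# highlight("... running out of disk space", ["disk", "space"])
# returns "... running out of **disk** **space**"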
Example Ticket Summary: N1VL-PA-APB169_Guest File System:/var|Partition Utilization_VirtualMachine ae81283e-dbac-4bc4-b780-bd37b07d3446/One or more virtual machine guest file systems are running out of disk space
Explainability Identified Pertinent Positive Features:
disk, space