AI for IT: Named Entity Recognition from Unstructured Texts in IT Services
Arun Ayachitula & Rohit Khandekar
Identifying and classifying named entities in unstructured text, known as Named Entity Recognition (NER), is a central problem in natural language processing with applications such as text classification and intent analysis.
Here we discuss the NER problem in the context of analyzing unstructured texts arising in the IT services domain. Consider an IT service provider managing an IT infrastructure comprising several software and hardware components. It encounters large volumes of unstructured text from incident tickets raised by users or by monitoring systems. As an example, consider the following incident report/ticket:
“Logging in Rational Business Developer, version 12.3.23 release 34, is set to ‘verbose’ level – these logs are filling up the temp space on H01NAXPROD.”
There are several “named” entities mentioned in this ticket:
Business Application: Rational Business Developer
Application version: version 12.3.23 release 34
Host: H01NAXPROD
Identifying such named entities helps in analyzing the intent of the incident ticket, routing it to the right mitigation team for resolution, and potentially avoiding similar incidents in the future.
The named entity types we focus on in our study include hostnames, IP addresses, business applications (and their versions), middleware applications (and their versions), and operating systems (and their versions).
Method
Our approach for addressing this problem has two steps:
1. Use the IT infrastructure topology and the NIST software product catalog to identify named entities that could potentially appear in the incident ticket data, and create a labeled dataset from the identified named entities.
2. Train a machine learning model on the labeled dataset to recognize named entities in unseen examples. To this end, we fine-tune Google’s language model BERT (Bidirectional Encoder Representations from Transformers).
Dictionary-based labeled data generation
To train any machine learning algorithm, we typically need a large labeled dataset containing many examples of verified named entities and of how they appear in running text relevant to the use case.
To this end, we use the IT infrastructure topology data for each stakeholder or client. For each client, we try to obtain a comprehensive list of the hosts and IP addresses in its IT infrastructure. Similarly, we obtain a fairly comprehensive list of software products, and versions thereof, from NIST’s National Vulnerability Database (NVD).
For each incident ticket, we analyze the unstructured text present in the abstract, description, and resolution fields. We split each sentence into tokens and find mentions of known entities such as hosts, IP addresses, applications, and OS names. This yields a labeled dataset for training a machine learning model.
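Conceptually, this dictionary matching might look like the minimal Python sketch below. The dictionaries, regular expression, and function name here are illustrative assumptions, not the production pipeline, which handles tokenization and entity spans with more care.

```python
import re

# Hypothetical dictionaries; in practice these come from each client's
# topology data and from NIST's NVD software catalog.
HOSTS = {"H01NAXPROD", "H02NAXDEV"}
APPLICATIONS = {"Rational Business Developer", "WebSphere Application Server"}
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")

def find_entity_mentions(text):
    """Return (mention, entity_type) pairs found in one ticket sentence."""
    mentions = []
    # Multi-word application names are matched as whole substrings.
    for app in APPLICATIONS:
        if app in text:
            mentions.append((app, "APPLICATION"))
    # Hostnames are matched token by token, after stripping punctuation.
    for token in re.split(r"[\s,;]+", text):
        token = token.strip(".'\"")
        if token in HOSTS:
            mentions.append((token, "HOST"))
    for ip in IP_PATTERN.findall(text):
        mentions.append((ip, "IP_ADDRESS"))
    return mentions

text = ("Logging in Rational Business Developer, version 12.3.23 release 34, "
        "is set to 'verbose' level - these logs are filling up the temp "
        "space on H01NAXPROD.")
print(find_entity_mentions(text))
# [('Rational Business Developer', 'APPLICATION'), ('H01NAXPROD', 'HOST')]
```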
Training a machine learning model
While the static dictionary-based approach may give fair coverage in identifying named entities, it can never be comprehensive, primarily because both the IT infrastructure and the set of software products evolve constantly. New hosts are added to the IT infrastructure and new software products are introduced on a daily basis, and keeping the dictionaries up to date is almost impractical.
We therefore take the approach of training a model that “learns” the patterns in how named entities appear in text and predicts potential named entities not seen before.
Given many examples like the one above, the model is expected to learn the linguistic patterns in which different types of named entities appear and to generalize them to detect unseen named entities of similar types.
Bidirectional Encoder Representations from Transformers (BERT) for Sequence Labeling Task
We model the NER problem as a sequence labeling problem. In machine learning, sequence labeling is a type of pattern recognition task that involves algorithmically assigning a categorical label to each member of a sequence of observed values. We fine-tune Google’s BERT language model for the sequence labeling task. BERT is a multi-layer bidirectional Transformer encoder. Transformers are multi-head attention mechanisms (encoders and decoders) used in NLP tasks involving sequence dependencies, and they have proven more effective than Hidden Markov Models (HMMs) and Conditional Random Fields (CRFs), as well as other neural-network-based approaches such as Recurrent Neural Networks (RNNs) and LSTMs.
To train the model, we first convert the dataset into the so-called IOB format, illustrated below. Each token in each sentence is labeled with one of ‘O’ (outside any entity), ‘B-XXXX’ (beginning token of an entity of type ‘XXXX’), or ‘I-XXXX’ (inside token of an entity of type ‘XXXX’).
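For instance, a hypothetical IOB labeling of a fragment of the sample ticket above might look like this (the entity type names are illustrative):

```
Logging       O
in            O
Rational      B-APPLICATION
Business      I-APPLICATION
Developer     I-APPLICATION
version       B-VERSION
12.3.23       I-VERSION
release       I-VERSION
34            I-VERSION
...
on            O
H01NAXPROD    B-HOST
```

With data in this form, the fine-tuning step can be sketched roughly as follows. This is a minimal sketch assuming the Hugging Face transformers library and PyTorch; the label set, checkpoint name, and single-sentence “training loop” are illustrative assumptions, not the setup used in our study.

```python
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

LABELS = ["O",
          "B-HOST", "I-HOST", "B-IP", "I-IP",
          "B-APPLICATION", "I-APPLICATION", "B-VERSION", "I-VERSION",
          "B-MIDDLEWARE", "I-MIDDLEWARE", "B-OS", "I-OS"]

tokenizer = BertTokenizerFast.from_pretrained("bert-base-cased")
model = BertForTokenClassification.from_pretrained(
    "bert-base-cased", num_labels=len(LABELS))

# One labeled sentence from the dictionary-based step, for illustration.
words = ["logs", "are", "filling", "up", "temp", "space", "on", "H01NAXPROD"]
word_labels = ["O", "O", "O", "O", "O", "O", "O", "B-HOST"]

enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
# BERT's WordPiece tokenizer may split one word into several sub-tokens.
# Here every sub-token simply inherits its word's label; special tokens
# get -100 so the loss ignores them. (A common alternative labels only
# the first sub-token and masks the rest.)
label_ids = [LABELS.index(word_labels[w]) if w is not None else -100
             for w in enc.word_ids(batch_index=0)]
labels = torch.tensor([label_ids])

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
loss = model(**enc, labels=labels).loss  # one gradient step
loss.backward()
optimizer.step()
```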
Out of 1 million sentences in the corpus of incident tickets, 241K sentences were labeled with at least one named entity. We trained the model on GPUs for 20 epochs, which took about 5 days. The model achieved an accuracy of over 99% on the sequence labeling task. More importantly, it was able to identify several unseen named entities. The Venn diagram below summarizes how the sets of named entities labeled or identified by the different methods intersect with one another.
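As a hypothetical illustration of this generalization, the fine-tuned model could be applied to a ticket mentioning a host that appears in neither the training data nor the dictionaries. The model path, pipeline usage, and example hostname below are assumptions for illustration, again using the Hugging Face transformers library.

```python
from transformers import pipeline

# "./ner-model" is an assumed path to the fine-tuned checkpoint above.
ner = pipeline("token-classification", model="./ner-model",
               aggregation_strategy="simple")

# Z99QRSPROD never appeared in training; a model that has learned the
# naming pattern and context of hosts can still tag it as a HOST.
print(ner("CPU alert raised on Z99QRSPROD after the nightly batch job."))
```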
Summary
We addressed the Named Entity Recognition (NER) problem arising in IT service incident tickets. Starting with known examples of hosts and IP addresses from the IT infrastructure topology catalog and examples of software applications from the NIST database, we created a labeled dataset of named entities. This dataset captures a rich variety of ways in which these named entities are mentioned in unstructured IT texts. We then modeled the NER problem as a sequence labeling problem and trained a deep-learning-based model by fine-tuning the BERT language model. The model was able to detect several new entities based on the linguistic features associated with the underlying entity types.
Acknowledgements: This article is based on joint work with Salim Roukos, Radu Florian, Parul Awasthy, and Navaneeth Vadapalli.