Course: Hands-On Natural Language Processing
What is named entity recognition (NER)?
- Named entity recognition is a natural language processing technique that locates named entities in unstructured text data and classifies them into predefined categories. It is also called entity extraction, entity identification, or entity chunking. The algorithm can recognize named entities such as people's and companies' names, addresses, dates, expressions of quantity, monetary values, and percentages. The machine learning used for named entity recognition can be either supervised, where the training data is labeled, or unsupervised, where the training data is not labeled. In practice, supervised machine learning is the most popular approach. The central principle of named entity recognition is to understand what's in the text, and to retrieve and collect important information for storage in databases.
Some of the most popular frameworks are spaCy; scispaCy, which is like spaCy but focuses on scientific and clinical documents; flairNLP; AllenNLP; Stanza; and Stanford NER. All of these frameworks are based on Python, except for Stanford NER, which is based on Java.
Now, there are many practical use cases of named entity recognition. In fact, you may already have come across them without knowing it. You can apply named entity recognition to review customer feedback and detect recurring problems in a certain location. For example, you can automatically categorize customer support tickets by product name or type to route each ticket to the appropriate agent. If you watch movies, listen to music, or browse products online, named entity recognition systems are probably helping to improve the recommendations you receive. They can also help recruiters save a lot of time when reviewing hundreds of resumes. The algorithm can automatically extract relevant information about candidates, such as their name, email, degrees, work experience, and so on.
And then there is plain-text cataloging, where the type of a text is determined based on the entities recognized in it. So, as you can see, named entity recognition techniques are valuable tools that speed up our work, helping us evaluate large amounts of textual data, categorize items, and even create item recommendations.
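To make the idea of "locate and classify" concrete, here is a minimal sketch in plain Python. It is not how the statistical frameworks mentioned above work: they learn from labeled data, whereas this toy version relies on a few hand-written regular expressions. The `Entity` type, the `PATTERNS` table, the label names, and the `extract_entities` function are all hypothetical names chosen for illustration, and the patterns only cover simple formats such as ISO-style dates and dollar amounts.

```python
import re
from typing import List, NamedTuple


class Entity(NamedTuple):
    """One recognized entity: its text, its category, and where it was found."""
    text: str
    label: str
    start: int
    end: int


# Hypothetical, deliberately simple patterns (an assumption for this sketch).
# A trained NER model would also handle names of people, companies, and places,
# which cannot be captured reliably with regular expressions.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "MONEY": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?"),
    "PERCENT": re.compile(r"\d+(?:\.\d+)?%"),
    "DATE": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def extract_entities(text: str) -> List[Entity]:
    """Locate pattern matches in the text and tag each with its category."""
    entities = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            entities.append(Entity(match.group(), label, match.start(), match.end()))
    # Return entities in the order they appear in the text.
    return sorted(entities, key=lambda entity: entity.start)


if __name__ == "__main__":
    sample = ("Invoice sent to ana@example.com on 2023-05-17: "
              "$1,250.50 due, with a 15% discount.")
    for entity in extract_entities(sample):
        print(f"{entity.label:8} {entity.text}")
```

Each result carries both the matched text and its character offsets, which is the kind of record you might store in a database or use to route a document. For people's names, company names, and addresses, you would normally reach for one of the trained frameworks listed earlier rather than extend the patterns.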