The practice of NLP
Sumanto Mukherjee, Practice Lead, Alchemy Techsol India


As we become more familiar with AI/ML approaches and start using them in various industrial applications, interest in them keeps growing. We are seeing increasing interest in customer behavior mapping. Whether it is an airline, an e-commerce company, or another SaaS provider, everyone wants to know what customers think about them. Thousands of tweets, Facebook reviews, online product reviews, and feedback inputs cannot be handled manually, so companies are adopting NLP algorithms for sentiment analysis. For example, an airline can use feedback inputs or Twitter to measure how often words such as “pathetic” or “bad” occur alongside “food”, “flight schedule”, “crew”, or “luggage handling”, and then take corrective action. A bot on the landing page is also useful for capturing customer input, because customers are sometimes reluctant to open a grievance window or tweet about their experience. At Alchemy Techsol, we generally advise our clients to start by hiring for the skills below:

JavaScript/HTML/CSS, and data scientists with Python (with knowledge of stemmers, tokenizers, part-of-speech tagging, lemmatization, n-grams, etc.). Several open source tools are available for this work. The process below is the simplest way for business leaders to understand how NLP works.

#nlp, #nlppractitioner , #nlpmasterpractitioner , #nlppw2022 , #AlchemyTechsol , #itrecruitment

Step 1: Part of speech tagging

This step classifies each word at a grammatical level, identifying which words are nouns, verbs, adjectives, adverbs, and so on. It also identifies the objects in a sentence. The tool picks out conjunctions and subordinate clauses as well, which helps it analyze the true meaning of the text. Plug-and-play tools are available for this.

Generally, a tool uses its own part-of-speech tagger for each language it supports. Part-of-speech tagging starts with accumulating a massive corpus of pre-tagged text. From this corpus, the tool trains a part-of-speech tagger that relies on probabilities to determine the correct part of speech for a given word in a given context. Accurate sentiment analysis requires a finely tuned and well-trained part-of-speech tagger, and many open source tools now do this job.
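To make the idea of training a tagger from pre-tagged text concrete, here is a minimal sketch in plain Python. It is not how any particular tool works: the four-sentence corpus, the tag names, and the `train` and `tag` helpers are all invented for illustration. It only shows how counts from tagged text can pick the most probable tag for a word given the tag before it.

```python
from collections import Counter, defaultdict

# Tiny hand-tagged corpus, invented for illustration only.
# A real tagger would be trained on a very large tagged corpus.
TAGGED_CORPUS = [
    [("the", "DET"), ("flight", "NOUN"), ("was", "VERB"), ("late", "ADJ")],
    [("the", "DET"), ("crew", "NOUN"), ("was", "VERB"), ("friendly", "ADJ")],
    [("they", "PRON"), ("book", "VERB"), ("a", "DET"), ("flight", "NOUN")],
    [("i", "PRON"), ("read", "VERB"), ("the", "DET"), ("book", "NOUN")],
]


def train(corpus):
    """Count tags per (previous tag, word) pair and per word alone."""
    by_context = defaultdict(Counter)  # (prev_tag, word) -> tag counts
    by_word = defaultdict(Counter)     # word -> tag counts
    for sentence in corpus:
        prev = "<START>"
        for word, tag_ in sentence:
            by_context[(prev, word)][tag_] += 1
            by_word[word][tag_] += 1
            prev = tag_
    return by_context, by_word


def tag(words, model, default="NOUN"):
    """Assign each word its most frequent tag, preferring counts in context."""
    by_context, by_word = model
    prev, result = "<START>", []
    for raw in words:
        word = raw.lower()
        if (prev, word) in by_context:
            best = by_context[(prev, word)].most_common(1)[0][0]
        elif word in by_word:
            best = by_word[word].most_common(1)[0][0]  # back off to the word alone
        else:
            best = default  # unseen word: fall back to an assumed default tag
        result.append((word, best))
        prev = best
    return result


model = train(TAGGED_CORPUS)
print(tag(["I", "book", "the", "flight"], model))   # "book" is tagged VERB
print(tag(["the", "book", "was", "late"], model))   # "book" is tagged NOUN
```

The same word, "book", gets a different tag depending on what comes before it, which is the point of using context rather than a fixed dictionary.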

Step 2: Lemmatization

The next step is to lemmatize each word where applicable. Lemmatization is the process of determining the base form (lemma) of a word, and it must be language specific. It relies on the rules for inflecting nouns and verbs by number, gender, tense, and so on. Lemmatization varies widely from language to language, and we cannot expect users to give their feedback only in English.
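As a rough illustration of why lemmatization depends on both the part of speech and the language, here is a toy English-only lemmatizer. The irregular-form table, the suffix rules, and the `lemmatize` function are assumptions made up for this sketch. Real lemmatizers use far larger dictionaries and rule sets, and these crude rules will get many words wrong.

```python
# Irregular forms that no suffix rule can recover (illustrative, far from complete).
IRREGULAR = {
    ("was", "VERB"): "be",
    ("were", "VERB"): "be",
    ("went", "VERB"): "go",
    ("children", "NOUN"): "child",
}

# English-only suffix rules per part of speech: (suffix, replacement).
SUFFIX_RULES = {
    "NOUN": [("ies", "y"), ("s", "")],
    "VERB": [("ies", "y"), ("ing", ""), ("ed", ""), ("s", "")],
}


def lemmatize(word, pos):
    """Return a guess at the base form of `word` for the given part of speech."""
    word = word.lower()
    if (word, pos) in IRREGULAR:
        return IRREGULAR[(word, pos)]
    for suffix, replacement in SUFFIX_RULES.get(pos, []):
        stem_length = len(word) - len(suffix)
        if not word.endswith(suffix) or stem_length < 3:
            continue
        if suffix == "s" and word.endswith("ss"):
            continue  # leave words such as "mess" alone
        return word[:stem_length] + replacement
    return word


print(lemmatize("flights", "NOUN"))   # flight
print(lemmatize("delayed", "VERB"))   # delay
print(lemmatize("was", "VERB"))       # be
print(lemmatize("meeting", "NOUN"))   # meeting
print(lemmatize("meeting", "VERB"))   # meet
```

The last two calls show why the part-of-speech step comes first: the same surface word maps to a different lemma depending on its tag.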

Step 3: Prior Polarity

Some words signal sentiment immediately, even without any surrounding context. “Love”, “hate”, and “despise” are examples of such polarizing words. Sentiment analysis requires an exhaustive list of terms that can be grouped by their prior polarity.
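A prior-polarity list can be pictured as a simple lookup table. The sketch below is only an illustration: the handful of words and their +1/-1 values are assumptions, and a usable lexicon would hold thousands of entries per language.

```python
import re

# Illustrative prior-polarity lexicon; words and values are assumed for this sketch.
PRIOR_POLARITY = {
    "love": 1.0,
    "good": 1.0,
    "hate": -1.0,
    "despise": -1.0,
    "pathetic": -1.0,
    "bad": -1.0,
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def polar_terms(text):
    """Return (term, polarity) pairs for every lexicon word found in the text."""
    return [(t, PRIOR_POLARITY[t]) for t in tokenize(text) if t in PRIOR_POLARITY]


print(polar_terms("I love the crew but hate the food"))
# [('love', 1.0), ('hate', -1.0)]
print(polar_terms("The flight departed at noon"))
# []
```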

Step 4: Wrapping it all up using machine learning

We will use machine learning to calculate a sentiment score that combines the presence of terms with prior polarity and the length of the text. For example, suppose I want to measure customer happiness in three categories (Happy, Neutral, Unhappy). A shorter text with a high ratio of polarizing to non-polarizing terms will receive a score closer to -1 (strongly negative) or +1 (strongly positive). A score of 0, or very close to 0 (within ±0.05), can be marked as “neutral”.
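The scoring idea can be sketched with a hand-written formula. Note that this is a stand-in, not a trained machine learning model: it simply divides the summed polarity by the number of words, so short, strongly worded texts land nearer to -1 or +1. The lexicon, the `sentiment_score` and `label` functions, and the exact formula are assumptions; only the ±0.05 neutral band and the three categories come from the text above.

```python
import re

# Illustrative lexicon; words and values are assumed for this sketch.
PRIOR_POLARITY = {"love": 1.0, "good": 1.0, "hate": -1.0, "pathetic": -1.0, "bad": -1.0}
NEUTRAL_BAND = 0.05  # scores within ±0.05 of zero count as neutral


def sentiment_score(text):
    """Summed prior polarity divided by text length, clamped to [-1, 1]."""
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    total = sum(PRIOR_POLARITY.get(t, 0.0) for t in tokens)
    return max(-1.0, min(1.0, total / len(tokens)))


def label(score):
    """Map a score to one of the three customer categories."""
    if abs(score) <= NEUTRAL_BAND:
        return "Neutral"
    return "Happy" if score > 0 else "Unhappy"


for review in ["Pathetic food", "I love the crew", "The flight departed at noon"]:
    score = sentiment_score(review)
    print(review, score, label(score))
# Pathetic food -0.5 Unhappy
# I love the crew 0.25 Happy
# The flight departed at noon 0.0 Neutral
```

In a real system, the weights and the neutral band would be learned or tuned on labeled feedback rather than fixed by hand.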

I would propose a bot that interacts with visitors about their complaints, enquiries, and feedback. Its API will be built on these metrics. A high-level architecture is suggested for reference.

Sumanto Mukherjee, Practice Lead
