Overview of Sentiment Analysis with LLM

This article gives an overview of sentiment analysis. We'll start by examining what sentiment analysis is and its various levels and tasks. Finally, we'll delve into sentiment analysis using Large Language Models.

Introduction

Sentiment analysis is also called opinion analysis or opinion mining. Several real-world applications require sentiment analysis for detailed investigation; in product analysis, for example, it can discover which components or qualities of a product appeal to customers.

Sentiment analysis serves applications such as reputation management, market research, competitor analysis, product analysis, and voice of the customer. Several issues complicate sentiment analysis and natural language processing, such as individuals' informal writing styles, sarcasm, irony, and language-specific challenges. Many words change their meaning and orientation depending on the context and domain in which they are employed, and tools and resources are not available for all languages. Sarcasm and irony are two of the most critical challenges and have recently attracted the attention of researchers, with considerable progress made in detecting them in text.

Types of Sentiment Analysis

Sentiment analysis has been investigated at several levels: document level, sentence level, phrase level, and aspect level.


Document level sentiment analysis

Document level: Sentiment analysis is performed on a whole document, and a single polarity is assigned to it. This type of sentiment analysis is not used often, but it can classify chapters or pages of a book as positive, negative, or neutral. At this level, both supervised and unsupervised learning approaches can be used to classify the document.

Sentence level sentiment analysis

Sentence level: In this level of analysis, each sentence is analyzed and assigned a corresponding polarity. This is highly useful when a document contains a wide mix of sentiments (Yang and Cardie 2014). This classification level is closely related to subjectivity classification (Rao et al. 2018). The polarity of each sentence is determined independently using the same methodologies as the document level, but with more training data and processing resources. Sentence polarities may then be aggregated to find the sentiment of the document or used individually.

Phrase level sentiment analysis

Phrase level: Sentiment analysis may also be performed at the phrase level, where opinion words are mined and classified. Each phrase may contain one or more aspects. This is useful for multi-line product reviews, where it is often observed that a single aspect is expressed within a phrase.

Aspect level sentiment analysis

Aspect-level sentiment analysis examines the words and phrases in a text to identify the specific aspects or features being discussed, then determines the sentiment (positive, negative, or neutral) associated with each of those aspects.

How it works:

Identify the aspects (features) from the text:

For example, in a sentence like "The screen is bright, but the sound quality is poor," the words "screen" and "sound quality" are recognized as aspects.

Determine the sentiment for each aspect:

The sentiment associated with "screen" is positive (because of the word "bright").

The sentiment associated with "sound quality" is negative (because of the word "poor").

In this way, aspect-level sentiment analysis goes beyond just detecting the overall tone of the text and identifies opinions related to specific words or aspects mentioned.
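
To make this walkthrough concrete, here is a minimal rule-based sketch in Python. The aspect list and opinion lexicon are illustrative assumptions, not a standard resource; real systems learn these from data.

```python
# A minimal rule-based sketch of aspect-level sentiment analysis.
# The aspect set and opinion lexicon below are illustrative assumptions.

ASPECTS = {"screen", "sound quality", "battery"}
OPINION_POLARITY = {"bright": "positive", "poor": "negative", "sharp": "positive"}

def aspect_sentiments(text: str) -> dict:
    """Assign a polarity to each known aspect found in the text,
    based on a known opinion word in the same clause."""
    results = {}
    # Split on simple clause boundaries so each aspect is paired
    # with opinion words from its own clause only.
    for clause in text.lower().replace(",", ".").split("."):
        found_aspects = [a for a in ASPECTS if a in clause]
        found_opinions = [w for w in OPINION_POLARITY if w in clause]
        for aspect in found_aspects:
            if found_opinions:
                results[aspect] = OPINION_POLARITY[found_opinions[0]]
    return results

print(aspect_sentiments("The screen is bright, but the sound quality is poor."))
# -> {'screen': 'positive', 'sound quality': 'negative'}
```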

Feature Selection

To develop a classification model, the relevant features in the dataset must first be identified. Commonly used surface features include the following (a small extraction sketch follows the list):

  1. Emojis: facial-expression symbols used in sentiment analysis to convey emotions.
  2. Punctuation marks, such as exclamation marks, highlight the force of a positive or negative remark; apostrophes and question marks carry similar signal.
  3. Slang words, such as lol and rofl, are frequently used to introduce a sense of humor into a remark.
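
As a rough illustration, here is a minimal extractor for these surface features. The slang set and emoticon pattern are illustrative assumptions; production systems use curated lexicons and emoji libraries.

```python
import re

# Illustrative slang list; a real system would use a curated lexicon.
SLANG = {"lol", "rofl", "lmao"}

def surface_features(text: str) -> dict:
    """Extract simple surface features often used in sentiment models."""
    tokens = re.findall(r"\w+", text.lower())
    return {
        "num_exclamations": text.count("!"),
        "num_questions": text.count("?"),
        "num_slang": sum(1 for t in tokens if t in SLANG),
        # Rough emoticon pattern (e.g. ":)", ":-(", ";)")
        "num_emoticons": len(re.findall(r"[:;]-?[)(DP]", text)),
    }

print(surface_features("Loved it!!! :) lol"))
# -> {'num_exclamations': 3, 'num_questions': 0, 'num_slang': 1, 'num_emoticons': 1}
```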

Feature Extraction

Feature extraction is a key task in sentiment classification: it extracts valuable information from the text data and directly impacts the performance of the model.

Negations These are words that can reverse the polarity of an opinion and shift the meaning of a sentence. Commonly used negation words include not, cannot, neither, never, nowhere, and none. Not every negation word in a sentence reverses the polarity, so simply removing all negation words via stop-word lists can increase computational cost and decrease the model's accuracy; negation words must be handled with the utmost care (George et al. 2013). Words such as not, neither, and nor are critical for sentiment analysis since they can reverse the polarity of a given phrase. For instance, "This movie is good." is a positive sentence, but "The movie is not good." is a negative sentence. Regrettably, some systems eliminate negation words because they appear in stop-word lists, or implicitly omit them because they have a neutral sentiment value in a lexicon. However, reversing the polarity is not straightforward, because negation words can occur in a sentence without affecting the text's sentiment.
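
A minimal sketch of window-based negation handling, assuming toy negation and polarity lexicons and a fixed window of three tokens; real systems use dependency parsing or learned models instead.

```python
# Flip the polarity of opinion words that appear shortly after a negation.
# The lexicons and window size are illustrative assumptions.

NEGATIONS = {"not", "never", "no", "cannot", "neither", "nor"}
POLARITY = {"good": 1, "great": 1, "bad": -1, "poor": -1}

def sentence_polarity(sentence: str, window: int = 3) -> int:
    tokens = sentence.lower().strip(".").split()
    score = 0
    for i, tok in enumerate(tokens):
        if tok in POLARITY:
            value = POLARITY[tok]
            # Reverse polarity if a negation occurs shortly before the word.
            if any(t in NEGATIONS for t in tokens[max(0, i - window):i]):
                value = -value
            score += value
    return score

print(sentence_polarity("This movie is good."))     # 1  (positive)
print(sentence_polarity("The movie is not good."))  # -1 (negative)
```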

Bag of Words (BoW) BoW is one of the simplest approaches for extracting text features. BoW describes the occurrence of words in a document: the bag represents the vocabulary of words, from which a vector is formed for each sentence. The main problem with this model is that it does not consider the syntactic meaning of the text. For instance, consider two sentences s1 = "the food was good" and s2 = "the service was bad". The vocabulary created for the two sentences is v = {'the', 'food', 'was', 'service', 'bad', 'good'}, so the vector length is 6, and the sentences are represented as v1 = [1 1 1 0 0 1] and v2 = [1 0 1 1 1 0]. BoW features are commonly weighted with TF-IDF (term frequency–inverse document frequency), which performs better than raw counts in most cases.
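
The same toy example can be reproduced with scikit-learn; note that CountVectorizer sorts the vocabulary alphabetically, so the column order differs from the hand-built vectors above.

```python
# BoW and TF-IDF over the two example sentences with scikit-learn.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

sentences = ["the food was good", "the service was bad"]

bow = CountVectorizer()
vectors = bow.fit_transform(sentences)
print(bow.get_feature_names_out())  # ['bad' 'food' 'good' 'service' 'the' 'was']
print(vectors.toarray())            # [[0 1 1 0 1 1], [1 0 0 1 1 1]]

# TF-IDF weighting, commonly used instead of raw counts:
tfidf = TfidfVectorizer()
print(tfidf.fit_transform(sentences).toarray().round(2))
```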

Word Embedding

Word embeddings represent words in a vector space, clustering words with similar meanings together. Each word is assigned a vector whose values are learned in a manner similar to a neural network, drawn from a predetermined vocabulary. The dimension of the vectors may be chosen as a hyperparameter. The skip-gram (SG) model and the continuous bag-of-words (CBOW) model are two of the most well-known algorithms for word embeddings. Both are shallow-window methods: a short window of some size, such as four or six, is specified; in CBOW the current word is predicted from its context words, while in SG the context words are predicted from the current word. Word embeddings thus learn about words from their local usage, as defined by a window of nearby terms.

Word2vec word2vec is a two-layer neural network used for vectorizing tokens. It is one of the most famous and widely used vectorizing techniques, developed by Mikolov et al. (2013). Word2vec has two main models, CBOW and SG. The CBOW model predicts the target word from its context words, whereas the SG model predicts the context words from the target word. With a larger dataset, the SG model performs better.

Global Vectors (GloVe) Global Vectors for word representation were developed by Pennington et al. (2014) using an unsupervised learning approach that generates word embeddings from a corpus's word-to-word co-occurrence matrix. GloVe is popular because it is straightforward and quick to train, thanks to its capacity for parallel implementation (Al Amrani et al. 2018).
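
A minimal Word2Vec sketch with gensim (assuming gensim >= 4.0); the toy corpus is only illustrative, since useful embeddings require a large corpus.

```python
from gensim.models import Word2Vec

corpus = [
    ["the", "food", "was", "good"],
    ["the", "service", "was", "bad"],
    ["good", "food", "and", "good", "service"],
]

# sg=1 selects skip-gram; sg=0 would select CBOW.
model = Word2Vec(corpus, vector_size=50, window=4, sg=1, min_count=1)

print(model.wv["food"].shape)         # (50,) - one vector per word
print(model.wv.most_similar("food"))  # nearest neighbours in the toy space
```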

FastText It is an open-source, free library developed by FAIR (Facebook AI Research), mainly used for word classification, vectorization, and the creation of word embeddings. It uses a linear classifier to train the model, which makes training very fast (Bojanowski et al. 2017). It supports both CBOW and SG models, and semantic similarities may be found using this model.
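
A similar sketch with gensim's FastText shows the subword advantage: a vector can be composed even for a misspelled, unseen word (again assuming gensim >= 4.0).

```python
from gensim.models import FastText

corpus = [["the", "food", "was", "good"], ["the", "service", "was", "bad"]]
model = FastText(corpus, vector_size=50, window=4, sg=1, min_count=1)

# "goood" never appeared in the corpus, but FastText composes a vector
# for it from character n-grams shared with "good".
print(model.wv["goood"].shape)  # (50,)
```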

ELMo ELMo is a deep contextualized text representation that helps overcome the limitations of conventional word-embedding approaches such as LSA, TF-IDF, and n-gram models (Peng et al. 2019). ELMo generates embeddings for words based on the contexts in which they are used, capturing word meaning along with additional contextual information. Through pretraining, ELMo can represent polysemous words more accurately across a variety of contexts and is more informative about the text's higher-level semantics (Ling et al. 2020).

Tasks of Sentiment Analysis

The various tasks of sentiment analysis are outlined below.


  1. Subjectivity Classification is a Natural Language Processing (NLP) task that classifies a piece of text as either subjective or objective: does it express personal opinions, feelings, and beliefs (subjective), or factual, neutral information (objective)? Subjective text expresses personal opinions, judgments, emotions, or viewpoints; it is non-factual and influenced by individual perceptions or feelings. For example, "The camera quality is amazing" is subjective because it conveys an opinion or personal experience. Objective text provides factual, verifiable information not influenced by personal feelings; "The camera has a 12-megapixel sensor" is objective because it presents a measurable fact. (A quick illustration follows this list.)
  2. Sentiment Classification is a Natural Language Processing (NLP) task that involves determining the emotional tone or sentiment expressed in a piece of text. It classifies the text into predefined categories of sentiment, such as positive, negative, or neutral. This classification is often used in applications like product reviews, social media analysis, and customer feedback to understand users' opinions or emotions about a particular subject.
  3. Opinion Spam Detection is the process of identifying and filtering out deceptive, fake, or manipulative opinions (e.g., product reviews, ratings, comments) that are intended to mislead potential consumers or distort the reputation of a product, service, or organization. These "spam" opinions are crafted to either promote or demote an item artificially and are a growing issue, especially in platforms like Amazon, Yelp, TripAdvisor, and social media.
  4. Implicit Language Detection: Sarcasm, irony, and humor are generally referred to as Implicit Languages.
  5. Aspect Extraction is a key task in Natural Language Processing (NLP) that focuses on identifying specific components, features, or attributes of a product or service mentioned in text. This process is crucial for understanding opinions expressed in reviews, feedback, or discussions, particularly in the context of aspect-based sentiment analysis.
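
To make the subjective/objective split concrete, here is a quick check with TextBlob's pattern-based subjectivity score (a lexicon-driven library, not an LLM), applied to the two example sentences from item 1.

```python
from textblob import TextBlob

subjective = TextBlob("The camera quality is amazing")
objective = TextBlob("The camera has a 12-megapixel sensor")

# .sentiment returns (polarity, subjectivity); subjectivity lies in [0, 1].
print(subjective.sentiment.subjectivity)  # close to 1: opinionated
print(objective.sentiment.subjectivity)   # close to 0: factual
```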

Methodology

The three main approaches to sentiment analysis are the lexicon-based approach, the machine learning approach (including deep learning), and the hybrid approach. In addition, researchers are continuously trying to find better ways to accomplish the task with higher accuracy and lower computational cost.

Neural Networks

RNNs (Donkers et al. 2017) have proven to improve results when trained with sufficient data and computation. Variants of the RNN (Pham and Le-Hong 2017) like LSTM (Bandara et al. 2020), GRU (Cheng et al. 2020), and Bi-LSTM (Abid et al. 2019; Cho and Lee 2019) have been used extensively in sentiment analysis and related NLP tasks (Abid et al. 2019; Khan et al. 2016). Attention models have been introduced recently, giving these models an edge over others. Recent transfer learning techniques using BERT (Devlin et al. 2018) and GPT (Ethayarajh 2019) are gaining researchers' attention, as these models are already trained on massive corpora for days on high-end GPUs and supercomputers; their weights can be fine-tuned on the training dataset to get accurate results. Deep learning-based techniques have become highly popular due to their outstanding performance in recent times. A minimal Bi-LSTM classifier sketch follows this paragraph.
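
The sketch below is in Keras; vocabulary size, sequence length, and layer widths are illustrative assumptions, not tuned values.

```python
# A minimal Bi-LSTM sentiment classifier.
import tensorflow as tf

VOCAB_SIZE, EMBED_DIM, MAX_LEN = 10_000, 128, 200

model = tf.keras.Sequential([
    tf.keras.Input(shape=(MAX_LEN,)),
    tf.keras.layers.Embedding(VOCAB_SIZE, EMBED_DIM),         # learned word embeddings
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),  # reads the sequence both ways
    tf.keras.layers.Dense(1, activation="sigmoid"),           # positive vs negative
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
# Training would use padded, integer-encoded reviews:
# model.fit(x_train, y_train, validation_split=0.1, epochs=3)
```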

Aspect Based Sentiment Analysis

ABSA is a valuable and rapidly growing part of sentiment analysis that has gained prominence in recent years. Aspect-level sentiment analysis is composed of three critical phases: aspect detection, polarity (sentiment) classification, and aggregation.

Aspect-level sentiment analysis is most popular for product and hotel reviews, as it helps businesses identify the aspects reviewers focus on and rectify those that carry negative sentiment.

Complex architectures like LSTM and Bi-LSTM, or pre-trained models like BERT and GPT-2, may be used to accomplish the task. Researchers avoid the vanilla RNN because it suffers from problems like vanishing and exploding gradients. One such setup is sketched below.
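
One common way to apply BERT here is to cast ABSA as sentence-pair classification: the review is the first segment and the aspect term the second. A sketch with Hugging Face transformers follows; note that the bare bert-base-uncased checkpoint has an untrained classification head, so its outputs are meaningful only after fine-tuning on an ABSA dataset.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3  # negative / neutral / positive
)

review = "The screen is bright, but the sound quality is poor."
for aspect in ["screen", "sound quality"]:
    # Encoded as: [CLS] review [SEP] aspect [SEP]
    inputs = tokenizer(review, aspect, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    print(aspect, logits.softmax(-1))
```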

Transfer Learning

Transfer learning is one of the advanced techniques in AI, where a pre-trained model transfers its acquired knowledge to a new model. Transfer learning exploits similarity in data, distribution, and task. The new model can directly use the previously learned features without any explicit training data, or training data may be used to fine-tune the model to a new task.

In 2018, Google AI Language researchers open-sourced a new model for NLP called BERT. It was a breakthrough and took the deep learning industry by storm due to its performance. In the work of Han et al. (2021), the Transformer network revolutionized the area of NLP and replaced the usage of LSTM and Bi-LSTM. The main advantage is that Transformers do not suffer from vanishing or exploding gradients because they do not use recurrence at all; they are also faster and less expensive to train.

BERT is an extension of the Transformer model proposed by Vaswani et al. (2017) in the "Attention Is All You Need" paper. BERT uses the Transformer's attention mechanism to learn contextual relationships between words or sub-words in a given text. Like the Transformer, its input contains word embeddings and position embeddings, but it adds an extra segment embedding indicating which sentence each token belongs to, so that two or more sentences can be handled at a time. BERT consists of stacked Transformer encoders, each similar to the original Transformer encoder. BERT has two variants: BERT-base, with 12 stacked encoders and 110 million parameters, and BERT-large, with 24 stacked encoders and 340 million parameters. The BERT model is trained in two stages, pre-training and fine-tuning; this is the model's main advantage, as fine-tuning can be done on a dataset suited to the task. A compact fine-tuning sketch follows.
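
The sketch uses Hugging Face transformers with a tiny illustrative dataset; a real setup would use a proper training set, batching, and evaluation.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

texts = ["this movie is good", "the movie is not good"]  # illustrative data
labels = torch.tensor([1, 0])                             # 1 = positive, 0 = negative

batch = tokenizer(texts, padding=True, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few illustrative steps, not a real training schedule
    outputs = model(**batch, labels=labels)  # the model computes cross-entropy loss
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    print(outputs.loss.item())
```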

Large Language Models

  • LLMs' performance and text length: LLMs tend to perform better on longer texts in a zero-shot setting, thanks to their pre-training on vast datasets. However, performance may decline on shorter texts or informal language, suggesting zero-shot use alone is insufficient for all tasks. While training an LLM on domain-specific tasks can yield better results, it is often too costly and resource-intensive; comparable or better results can sometimes be achieved with aspect-based sentiment analysis on movie reviews.
  • Challenges with informal language: Slang, sarcasm, and punctuation pose significant challenges for LLMs. Movies are inherently subjective, and reviews often employ metaphors and sarcasm; this complexity can pose challenges for LLMs.
  • Zero-shot capability is promising but not a universal solution: LLMs can analyze sentiment without task-specific training data, which is advantageous when labeled review datasets are scarce. While convenient, zero-shot LLM performance might not always match the accuracy of models specifically trained on labeled review data, especially in specialized domains (see the prompt sketch after this list).
  • Aspect-based sentiment analysis with LLMs hasn't been properly researched yet; whether zero-shot LLMs perform better or worse is still unclear.
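
As an illustration of the zero-shot approach, here is a prompt-based sketch using the OpenAI Python SDK. The model name is an assumption; the same prompt pattern applies to any capable instruction-tuned LLM.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

review = "The plot dragged, but honestly the soundtrack saved the whole film."
prompt = (
    "Classify the sentiment of this movie review as positive, negative, "
    f"or mixed, and name the aspects driving each sentiment:\n\n{review}"
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name; substitute any chat model
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)
```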

When to use LLM for sentiment analysis

Considerations for choosing LLMs for review analysis:

  • Nature of the reviews: Consider the source (social media vs. dedicated review platforms), length, and domain specificity.
  • Need for aspect-based analysis: Assess if understanding sentiment towards specific aspects is necessary.
  • Availability of labeled data: If labeled review data is scarce, zero-shot or few-shot LLM approaches might be suitable, while ample data could favor fine-tuned models.
  • Importance of explainability: If understanding the reasoning behind sentiment predictions is crucial, LLMs' explainability features are advantageous.

The resource I used to write this article is:

  • Wankhade, M., Rao, A.C.S. and Kulkarni, C., 2022. A survey on sentiment analysis methods, applications, and challenges. Artificial Intelligence Review, 55(7), pp. 5731–5780.
