Sentiment Analysis using Bi-Directional LSTM

As mentioned in my previous article, Sentiment Analysis using Deep Learning (1-D CNN), this post performs sentiment analysis on the same data using a bidirectional LSTM (Bi-LSTM), a form of Recurrent Neural Network (RNN).

Why RNN: RNNs are designed to exploit sequential data, where the current step depends on previous steps. This makes them well suited to applications with a time component (audio, time-series data) and to natural language processing. RNNs perform well wherever order matters, because ignoring sequential information can distort meaning or break grammar. Applications include image captioning, language modeling and machine translation.

Why LSTM: A Long Short-Term Memory (LSTM) network stores historical information in a memory cell: at each time step the cell carries forward information from previous inputs, which effectively alleviates the long-range dependency problem of plain RNNs.

Why Bi-LSTM: A standard LSTM only sees past context. A Bi-LSTM processes the sequence in both directions, capturing both historical and future information at every step, which typically yields better performance on tasks like sentiment analysis.
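
To make this concrete, below is a minimal sketch of how such a Bi-LSTM classifier might be defined in Keras. The vocabulary size, LSTM units and other hyperparameters are illustrative assumptions, not the exact values from my code (the embedding dimension of 50 and the 500-word input length do match the setup described later in this article):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

# Illustrative hyperparameters (assumptions, not the exact values used here)
VOCAB_SIZE = 10000  # words kept in the tokenizer vocabulary
EMBED_DIM = 50      # embedding dimension (this article uses 50)
MAX_LEN = 500       # first 500 words of each review (as noted below)

model = Sequential([
    # Map each word index to a dense 50-dimensional vector
    Embedding(input_dim=VOCAB_SIZE, output_dim=EMBED_DIM, input_length=MAX_LEN),
    # The Bidirectional wrapper runs one LSTM forward and one backward over
    # the sequence and concatenates their outputs, so every position sees
    # both past and future context
    Bidirectional(LSTM(64)),
    # Single sigmoid unit for binary positive/negative sentiment
    Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])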

Note: I trained the Bi-LSTM on a Windows 10 machine with an i7 CPU, and training took around 8 hours to complete (1 hour per epoch on average). I highly recommend using a GPU if you want to save time. This snippet shows the time and resources it took to train the model:

[Screenshot: training time and resource usage]

For details on the input data and callback functions, refer to my previous article on sentiment analysis: Sentiment Analysis using Deep Learning (1-D CNN).

Once again, I have used TensorBoard to monitor training and validation metrics.

TensorBoard

TensorBoard provides an excellent way to visualize the various metrics generated during model training and validation. Some of the features I found immensely useful:

·    Tracking and visualizing metrics such as loss and accuracy

·    Visualizing the model graph (ops and layers)

·    Viewing histograms of weights, biases, or other tensors as they change over time
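
For reference, hooking TensorBoard into Keras training only takes a callback. Here is a minimal sketch (the log directory name is an illustrative assumption):

from tensorflow.keras.callbacks import TensorBoard

# histogram_freq=1 records weight/bias histograms every epoch
tensorboard_cb = TensorBoard(log_dir='logs/bilstm', histogram_freq=1)

# Pass it alongside any other callbacks when fitting the model:
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=8, callbacks=[tensorboard_cb])

# Inside a Jupyter notebook, TensorBoard can then be launched inline:
#   %load_ext tensorboard
#   %tensorboard --logdir logs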

It can easily be invoked on a Windows 10 machine from a Jupyter notebook with minimal effort, so I leveraged it to monitor some metrics. Here are a couple of samples:

[TensorBoard screenshots: training and validation metrics]

Here are some sample predictions generated by the Bi-LSTM model on IMDB reviews:

[Screenshot: sample Bi-LSTM predictions on IMDB reviews]
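
For context, generating a prediction like the ones above boils down to tokenizing the raw text, padding it to the training length, and calling model.predict. A minimal sketch, assuming a Keras tokenizer fitted during training (the tokenizer name and the 0.5 decision threshold are illustrative):

from tensorflow.keras.preprocessing.sequence import pad_sequences

def predict_sentiment(model, tokenizer, review, max_len=500):
    """Return a positive-sentiment probability for a raw review string."""
    # Convert the review text into a sequence of word indices
    seq = tokenizer.texts_to_sequences([review])
    # Pad/truncate to the same length used during training
    padded = pad_sequences(seq, maxlen=max_len)
    # Sigmoid output: close to 1.0 means positive, close to 0.0 means negative
    return float(model.predict(padded, verbose=0)[0][0])

# Example usage:
# score = predict_sentiment(model, tokenizer, "A wonderful, moving film.")
# print("positive" if score >= 0.5 else "negative", round(score, 3))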

Further Enhancements

There are a number of improvements that could be made to this model, including (but not limited to):

·     Improving the word embeddings by increasing the embedding dimension from 50 to 100 or more

·     Leveraging existing pre-trained word embeddings, such as the Word2Vec vectors trained on the Google News dataset (about 100 billion words); see the sketch after this list

·     Training the model on a larger dataset, possibly using an adversarial network to generate a large amount of additional training data

·     Currently the model is trained on only the first 500 words of each review/comment; this could be increased to 1,000 words

·     Leveraging Transformer-based models
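
To illustrate the pre-trained embedding idea from the list above, here is a sketch of loading the Google News Word2Vec vectors with gensim and using them to initialise the Embedding layer. The file name is the standard Word2Vec download; the vocabulary size and tokenizer name are illustrative assumptions:

import numpy as np
from gensim.models import KeyedVectors
from tensorflow.keras.initializers import Constant
from tensorflow.keras.layers import Embedding

# Load the pre-trained 300-dimensional Word2Vec vectors
w2v = KeyedVectors.load_word2vec_format(
    'GoogleNews-vectors-negative300.bin', binary=True)

VOCAB_SIZE = 10000  # illustrative; should match the tokenizer vocabulary
EMBED_DIM = 300

# Build an embedding matrix with one row per word in the tokenizer vocabulary
embedding_matrix = np.zeros((VOCAB_SIZE, EMBED_DIM))
for word, idx in tokenizer.word_index.items():  # `tokenizer` fitted on the reviews
    if idx < VOCAB_SIZE and word in w2v:
        embedding_matrix[idx] = w2v[word]

# Initialise the Embedding layer with the pre-trained vectors and freeze it
embedding_layer = Embedding(VOCAB_SIZE, EMBED_DIM,
                            embeddings_initializer=Constant(embedding_matrix),
                            trainable=False)

Setting trainable=False keeps the pre-trained vectors fixed; allowing them to fine-tune is also an option once the rest of the model has converged.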

Here is the link to the GitHub repository with the full code and test data: https://github.com/Ankit-DA/Sentiment_Analysis_Deep_Learning
