Improving Predictions in Language Modelling
Here is something I picked up along the way on how to improve the predictions of LSTM networks, specifically for Language Modelling, i.e., generating text.
Here are some techniques that help LSTMs perform better at the prediction stage:
NOTE: These optimization techniques are not specific to LSTMs; rather, any sequential model can benefit from them.
Now, let's understand them in the context of Language Modelling,
The first technique is greedy sampling:
For example: suppose we have the sentence 'Amit is learning Natural Language Processing'. Given the first word, 'Amit', we want our LSTM network to predict the subsequent words.
If we choose the next word deterministically (always picking the single most probable word), the LSTM might output a repetitive sequence like: 'Amit is learning is Natural learning'
However, by sampling the next word from a small subset of the vocabulary (the most probable words), the LSTM is forced to vary its predictions. It might then output the desired sentence with a respectable probability, or something similar like: 'Amit is learning Processing Natural Language'.
Although greedy sampling adds flavor/diversity to the generated text, it does not guarantee that the output will always be realistic, especially for longer text sequences.
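The idea above can be sketched as follows. This is a minimal illustration, not any particular library's API: `sample_top_k` is a hypothetical helper that, instead of always taking the argmax, draws the next word's index from the k most probable entries of the model's output distribution.

```python
import numpy as np

def sample_top_k(probs, k, rng=np.random.default_rng(0)):
    """Sample a word index from the k most probable entries of `probs`."""
    # Indices of the k highest-probability words
    top = np.argsort(probs)[-k:]
    # Renormalise the truncated distribution so it sums to 1
    p = probs[top] / probs[top].sum()
    return int(rng.choice(top, p=p))

# Toy softmax output over a 5-word vocabulary
probs = np.array([0.5, 0.3, 0.1, 0.07, 0.03])
word_id = sample_top_k(probs, k=3)  # one of the indices 0, 1, or 2
```

Because the choice is random within the top k, repeated calls vary the generated word, which is exactly what breaks the repetitive loops a purely deterministic argmax tends to produce.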
Next comes beam search:
The crucial idea of beam search is to maintain 'b' candidate outputs simultaneously instead of a single output. We look further into the future before committing to a prediction, which usually leads to better results.
Here, 'b' is known as the beam width, and the 'b' candidate outputs kept at each step are known as the beam.
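A minimal sketch of beam search over a toy next-word model (the bigram probability table and the function names here are made up for illustration; a real language model would supply the next-word distribution):

```python
import math

def beam_search(start, next_probs, b, steps):
    """Keep the b highest log-probability sequences at each step."""
    beam = [([start], 0.0)]  # (sequence, cumulative log-probability)
    for _ in range(steps):
        candidates = []
        for seq, score in beam:
            # Extend every sequence in the beam by every possible next word
            for word, p in next_probs(seq[-1]).items():
                candidates.append((seq + [word], score + math.log(p)))
        # Prune back to the b best-scoring candidates
        beam = sorted(candidates, key=lambda c: c[1], reverse=True)[:b]
    return beam[0][0]  # best full sequence

# Hypothetical bigram table for the example sentence
table = {
    'Amit': {'is': 0.9, 'learning': 0.1},
    'is': {'learning': 0.8, 'Natural': 0.2},
    'learning': {'Natural': 0.6, 'is': 0.4},
    'Natural': {'Language': 0.9, 'learning': 0.1},
    'Language': {'Processing': 0.95, 'is': 0.05},
    'Processing': {'is': 1.0},
}
best = beam_search('Amit', lambda w: table[w], b=2, steps=5)
# best == ['Amit', 'is', 'learning', 'Natural', 'Language', 'Processing']
```

Note that with b = 1 this reduces to picking the single most probable word at every step; larger b lets a sequence that starts with a lower-probability word win overall if its later words score highly.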
Other variants of LSTMs include Peephole connections, GRUs, etc.
If you are interested in delving deeper into understanding these concepts, consider checking out my notebooks:
In one of my previous posts, I wrote about neural networks like RNNs, LSTMs, and GRUs, which are explicitly designed for sequential data such as text.
Here is a link to the post: