N-grams

N-gram is one of the most commonly used terms in the domain of NLP. The term may sound difficult, but in reality it is very easy to understand: it describes the co-occurrence of words. The N in “N-gram” indicates how many co-occurring words one wants to consider. N=1 is called a “unigram”, n=2 a “bigram”, n=3 a “trigram”, and beyond that it's a 4-gram, 5-gram, and so on.

An example of unigrams, bigrams and trigrams is presented below. A unigram is a list of single words, the same as tokenizing the text. A bigram is a list of two co-occurring (consecutive) words in the sentence, and a trigram is a list of three consecutive words.

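For example, for the sentence “I liked your watch”, the unigrams are [I, liked, your, watch], the bigrams are [(I, liked), (liked, your), (your, watch)], and the trigrams are [(I, liked, your), (liked, your, watch)].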

Application

While writing an email, Gmail provides suggestions for the next word. An n-gram model is one way to do that. Let’s look at one example.


Consider bigrams which, as described above, are pairs of two co-occurring words. Based on these bigrams, the model predicts the next word from the last word that has been written.
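As a minimal sketch of that prediction step (illustrative Python only, not the article's own code; the bigram counts and names here are hypothetical), the idea is simply to suggest the most frequent word that has followed the last typed word:

from collections import Counter

# Hypothetical bigram counts: last word -> counts of the words that followed it.
bigram_next = {"your": Counter({"patience": 1, "watch": 1, "reference": 1})}

def suggest(last_word):
    # Suggest the most frequent continuation of the last typed word, if any.
    counts = bigram_next.get(last_word)
    return counts.most_common(1)[0][0] if counts else None

print(suggest("your"))  # suggests one of 'patience', 'watch', 'reference' (all equally frequent here)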

Probability aspect of n-grams

Consider the following three sentences:

1.      Thanks for your patience.

2.      I liked your watch.

3.      Resume for your reference.

Based on the above training data, I want to generate suggestions for the user and propose the next word once they write “your“ in a sentence. If we apply a bigram model here, the probability of completing the sentence with the word “patience” as the suggestion is 1/3, since it occurs once out of the three total occurrences of the word “your”. The same holds for “watch” and “reference”.

Whereas, if we use a trigram model, the last three words will be considered during training. So, during testing, it will predict a word once the person writes “for your”. This reduces the ambiguity, and the updated probability of predicting “patience” is 1/2, as “for your” occurs in a total of only 2 trigrams. The model will therefore make the right prediction in half of such cases.
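To make the counting above concrete, here is a minimal Python sketch (illustrative only, not the article's original code): it builds bigram and trigram counts from the three training sentences and prints the next-word probabilities after “your” and after “for your”.

from collections import Counter, defaultdict

# The three training sentences from above.
sentences = [
    "Thanks for your patience",
    "I liked your watch",
    "Resume for your reference",
]

bigram_next = defaultdict(Counter)   # previous word -> counts of the next word
trigram_next = defaultdict(Counter)  # previous two words -> counts of the next word

for sentence in sentences:
    tokens = sentence.lower().split()
    for w1, w2 in zip(tokens, tokens[1:]):
        bigram_next[w1][w2] += 1
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        trigram_next[(w1, w2)][w3] += 1

# Bigram model: "your" is followed by patience, watch and reference, each 1 time out of 3.
total = sum(bigram_next["your"].values())
print({word: count / total for word, count in bigram_next["your"].items()})

# Trigram model: "for your" occurs only twice, so "patience" now has probability 1/2.
total = sum(trigram_next[("for", "your")].values())
print({word: count / total for word, count in trigram_next[("for", "your")].items()})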

N-gram codes

1.      Unigram:

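The original code image is not available, so here is a minimal sketch in plain Python (the example sentence is just illustrative). Unigrams are simply the individual tokens of the text.

# Unigram: each individual token of the text.
text = "Thanks for your patience"
tokens = text.lower().split()   # simple whitespace tokenization
unigrams = tokens               # a unigram is just a single token
print(unigrams)                 # ['thanks', 'for', 'your', 'patience']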

2.      Bigram:

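Again a minimal sketch in plain Python with the same illustrative sentence: each token is paired with the token that follows it.

# Bigram: pair each token with the token that follows it.
text = "Thanks for your patience"
tokens = text.lower().split()
bigrams = list(zip(tokens, tokens[1:]))
print(bigrams)   # [('thanks', 'for'), ('for', 'your'), ('your', 'patience')]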

3.      Trigram:

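And a minimal trigram sketch, again in plain Python with the same illustrative sentence: a window of three consecutive tokens is slid across the text.

# Trigram: slide a window of three consecutive tokens across the text.
text = "Thanks for your patience"
tokens = text.lower().split()
trigrams = list(zip(tokens, tokens[1:], tokens[2:]))
print(trigrams)  # [('thanks', 'for', 'your'), ('for', 'your', 'patience')]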

I hope you liked it. Stay tuned for more!

