Bi-gram Model (Part 2)

The last post described the theory behind the bi-gram model. This article covers the results of the model and their interpretation. The earlier article on the text analysis of the three books showed the similarity in the top words between the training and testing datasets.

Results

The results of the bi-gram model are presented below:

[Chart: average log-likelihood of the bi-gram model]

The results of the uniform model are as follows:

[Chart: average log-likelihood of the uniform model]

The average log-likelihood of the bi-gram model is lower than that of the unigram model. The simpler model can sometimes give a better result. A detailed comparison of the results of the three models is presented below.

[Chart: average log-likelihood of the uniform, unigram, and bi-gram models]

It is clear from the chart that the higher-order n-gram model has a lower average log-likelihood than the simpler, more naive models.
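
As a rough illustration of where these numbers come from, below is a minimal sketch of how an add-s smoothed bi-gram model and its average log-likelihood per bigram could be computed. The tokenisation, function names, and toy data are my own assumptions and not the exact code linked at the end of the article.

```python
from collections import Counter
import math

def train_bigram(tokens):
    """Collect unigram and bigram counts from a list of training tokens."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams, len(unigrams)

def bigram_log_prob(prev, word, unigrams, bigrams, vocab_size, s=1.0):
    """Add-s smoothed conditional probability P(word | prev), in log space."""
    numerator = bigrams[(prev, word)] + s
    denominator = unigrams[prev] + s * vocab_size
    return math.log(numerator / denominator)

def average_log_likelihood(test_tokens, unigrams, bigrams, vocab_size, s=1.0):
    """Average log-likelihood per bigram over the evaluation tokens."""
    pairs = list(zip(test_tokens, test_tokens[1:]))
    total = sum(bigram_log_prob(p, w, unigrams, bigrams, vocab_size, s)
                for p, w in pairs)
    return total / len(pairs)

# Toy usage: the data here is purely illustrative.
train_tokens = "the cat sat on the mat and the cat ate".split()
test_tokens = "the cat sat on the mat".split()
unigrams, bigrams, V = train_bigram(train_tokens)
print(average_log_likelihood(test_tokens, unigrams, bigrams, V, s=1.0))
```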

Impact of 's' on average log-likelihood

[Chart: average log-likelihood of the bi-gram model for different values of s]

The results of this analysis differ slightly from the unigram results: the average log-likelihood first increases up to s = 3 and then starts decreasing.
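
Continuing from the sketch above, a sweep over s that mirrors this chart might look like the following; the particular s values are illustrative assumptions.

```python
# Continuing from the sketch above: vary the smoothing constant s and
# record the average log-likelihood, as in the chart. The s values here
# are illustrative, not necessarily the ones used in the article.
for s in [0.5, 1, 2, 3, 4, 5]:
    ll = average_log_likelihood(test_tokens, unigrams, bigrams, V, s=s)
    print(f"s = {s}: average log-likelihood = {ll:.4f}")
```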

Other Evaluation Metrics

There are other metrics that can be used for evaluation, namely cross-entropy and perplexity. Cross-entropy is the negative of the average log-likelihood, and perplexity is the exponential of the cross-entropy. Both can be implemented with minor changes to the evaluation formula.
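
As a small sketch, assuming the average log-likelihood uses the natural logarithm, the two metrics can be derived from it as follows; the value of avg_ll is a hypothetical placeholder.

```python
import math

def cross_entropy(avg_log_likelihood):
    """Cross-entropy is the negative of the average log-likelihood."""
    return -avg_log_likelihood

def perplexity(avg_log_likelihood):
    """Perplexity is the exponential of the cross-entropy."""
    return math.exp(-avg_log_likelihood)

avg_ll = -5.2                      # hypothetical value, for illustration only
print(cross_entropy(avg_ll))       # 5.2
print(perplexity(avg_ll))          # e ** 5.2, roughly 181.3
```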

With this, I conclude the bi-gram model series. I hope you enjoyed the article. Stay tuned for more!!

Link to the code:






