Deep Learning for NLP Part-2
Niraj Kumar, Ph.D.
AI/ML R&D Leader | Driving Innovation in Generative AI, LLMs & Explainable AI | Strategic Visionary & Patent Innovator | Bridging AI Research with Business Impact
Sequence transduction plays a very important role in natural language processing. The ability to transform and manipulate sequences of one type into another is a crucial part of human intelligence. Today, attention-based mechanisms support many types of sequence transduction, including (but not limited to) sequence-to-sequence mapping, machine translation, text-to-speech, speech-to-text, selective summary generation, and protein secondary-structure prediction. Attention was developed to relieve a key bottleneck of traditional encoder-decoder architectures: the encoder had to compress the entire input sequence into a single fixed-length vector. To achieve this, attention layer(s) are placed between the encoder and decoder layers. The way attention operates can also serve as its definition:
Attention Definition: Given a set of value vectors and a query vector, attention is a technique to compute a weighted sum of the values, where the weights depend on the query.
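Read literally, this definition is only a few lines of code. Below is a minimal sketch of my own (NumPy, assuming dot-product scoring and using the values as their own keys); the names `attention`, `query`, and `values` are illustrative, not from any particular library.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention(query, values):
    """Weighted sum of the values, with weights dependent on the query.

    query:  (d,)   one query vector
    values: (n, d) n value vectors (also used as keys here)
    """
    scores = values @ query      # similarity of each value to the query, (n,)
    weights = softmax(scores)    # normalize scores into a distribution, (n,)
    return weights @ values      # weighted sum of the values, (d,)

# Toy example: 4 value vectors of dimension 3
rng = np.random.default_rng(0)
context = attention(rng.standard_normal(3), rng.standard_normal((4, 3)))
print(context.shape)  # (3,)
```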
Progress in attention mechanisms.
If we look at the progress of attention-based mechanisms, we find that most of the scientific literature considers only one or two fixed architectural places for modification. This becomes clear from the following steps.
Steps to apply attention in sequence transduction.
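In the standard encoder-decoder recipe (cf. Bahdanau et al. [2] and Luong et al. [3]), the steps are: (1) score each encoder hidden state against the current decoder state, (2) normalize the scores into attention weights, (3) take the attention-weighted sum of the encoder states as a context vector, and (4) combine the context vector with the decoder state to make the prediction. The sketch below is my own minimal NumPy illustration of one such decoder step, assuming Luong-style multiplicative scoring; all names, shapes, and the random parameters are assumptions for the example.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attention_decoder_step(decoder_state, encoder_states, W_c):
    """One decoding step with attention between encoder and decoder.

    decoder_state:  (d,)    current decoder hidden state (the query)
    encoder_states: (T, d)  all encoder hidden states (the values)
    W_c:            (d, 2d) learned projection for the attentional state
    """
    # Step 1: score each encoder state against the decoder state.
    scores = encoder_states @ decoder_state              # (T,)
    # Step 2: normalize scores into attention weights.
    weights = softmax(scores)                            # (T,)
    # Step 3: build the context vector as a weighted sum of encoder states.
    context = weights @ encoder_states                   # (d,)
    # Step 4: combine context and decoder state for the prediction layer.
    attentional = np.tanh(W_c @ np.concatenate([context, decoder_state]))
    return attentional, weights

# Toy usage: T=6 source positions, hidden size d=4
rng = np.random.default_rng(0)
h_t, enc = rng.standard_normal(4), rng.standard_normal((6, 4))
out, w = attention_decoder_step(h_t, enc, rng.standard_normal((4, 8)))
print(out.shape, w.sum())  # (4,) 1.0
```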
NOTE: The Transformer model [1] uses scaled dot-product attention. In the next article, I will try to cover BERT and XLNet.
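For concreteness, here is a small sketch of that scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k))V, as defined in [1]; the NumPy implementation and the toy shapes are my own.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, as in [1].

    Q: (n_q, d_k) queries, K: (n_k, d_k) keys, V: (n_k, d_v) values
    """
    d_k = K.shape[-1]
    scores = (Q @ K.T) / np.sqrt(d_k)  # scaling keeps large dot products
                                       # from saturating the softmax
    weights = softmax(scores)          # each row: a distribution over keys
    return weights @ V                 # (n_q, d_v)

# Self-attention over a toy sequence of 5 tokens, d_k = d_v = 8
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))
out = scaled_dot_product_attention(X, X, X)
print(out.shape)  # (5, 8)
```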
Tutorials on Attention-Based Models and the Transformer Model for NLP
References.
1. Vaswani, Ashish, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. "Attention is all you need." In Advances in Neural Information Processing Systems, pp. 5998-6008. 2017.
2. Bahdanau, Dzmitry, Kyunghyun Cho, and Yoshua Bengio. "Neural machine translation by jointly learning to align and translate." arXiv preprint arXiv:1409.0473 (2014).
3. Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. "Effective approaches to attention-based neural machine translation." arXiv preprint arXiv:1508.04025 (2015).
4. Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." In Advances in Neural Information Processing Systems. 2014.
5. Wu, Yonghui, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, et al. "Google's neural machine translation system: Bridging the gap between human and machine translation." arXiv preprint arXiv:1609.08144 (2016).