Transformers

We’re exploring the realm of Deep Learning, focusing on the pivotal role that “transformers” play in driving advancements in AI, rather than referring to the fictional robots of cinema fame.

The Transformer was first proposed in the 2017 paper “Attention Is All You Need” by researchers at Google and the University of Toronto.

Transformers employ semi-supervised learning: they are pre-trained in an unsupervised manner on large, unlabeled datasets and then fine-tuned with supervised training to adapt them to a specific task. They also process all the tokens of a sequence in parallel, which significantly speeds up training.

Typical applications include language translation, document summarization, and auto-completion.
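To make a couple of these tasks concrete, here is a minimal sketch using the Hugging Face `transformers` library; this is my illustrative choice (not something this article relies on), and the `t5-small` checkpoint is just a small, commonly used public model.

```python
# Assumes `pip install transformers torch`; the model name is an illustrative public checkpoint.
from transformers import pipeline

# Document summarization with a small pre-trained Transformer
summarizer = pipeline("summarization", model="t5-small")
print(summarizer(
    "Transformers process whole sequences at once using attention, "
    "which made them the backbone of modern NLP systems.",
    max_length=25, min_length=5)[0]["summary_text"])

# English-to-French translation with the same checkpoint
translator = pipeline("translation_en_to_fr", model="t5-small")
print(translator("Attention is all you need.")[0]["translation_text"])
```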

What sets transformers apart from other models?

  1. Attention Mechanism — In transformers, the attention mechanism computes attention scores between each pair of tokens in the input sequence. These scores determine how much focus the model gives to every other token when processing a particular token. For example, in the sentence “the animal didn’t cross the street because it was too tired,” the attention mechanism assigns a high weight to “animal” when processing “it,” since “it” refers to the animal (see the sketch after this list).

2. Positional Encoding — Positional encoding is a crucial component of transformers that provides information about the position of words or tokens within a sequence. Since transformers process input sequences in parallel, they lack the inherent understanding of the sequential order of tokens that RNNs possess. Positional encoding addresses this limitation by injecting positional information into the input embeddings. This allows the transformer model to differentiate between tokens based on their positions within the sequence.

3. Parallel Processing — Transformers process the entire input sequence in parallel, enabling faster training and inference, especially for long sequences.
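To ground points 1 and 2, here is a minimal NumPy sketch (mine, not from the paper or this article) of sinusoidal positional encoding and scaled dot-product self-attention; the sequence length, model width, and random embeddings are toy values chosen purely for illustration.

```python
import numpy as np

def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encodings in the style of the original paper."""
    pos = np.arange(seq_len)[:, None]                        # (seq_len, 1) token positions
    i = np.arange(d_model)[None, :]                          # (1, d_model) embedding dimensions
    angle = pos / np.power(10000, (2 * (i // 2)) / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])                     # even dimensions: sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])                     # odd dimensions: cosine
    return pe

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                          # pairwise token-to-token scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ V, weights

# Toy sequence of 5 tokens with 8-dimensional embeddings (made-up numbers)
seq_len, d_model = 5, 8
x = np.random.randn(seq_len, d_model) + positional_encoding(seq_len, d_model)
out, attn = scaled_dot_product_attention(x, x, x)            # self-attention: Q = K = V = x
print(attn.shape)                                            # (5, 5): one weight per pair of tokens
```

Each row of `attn` is a probability distribution over all tokens in the sequence, which is exactly the pair-wise scoring described in point 1; adding the positional encodings to the embeddings before attention is what gives the model a sense of token order (point 2).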

The Transformer architecture consists of two parts (a short PyTorch sketch follows this list):

  1. Encoder — The encoder takes the input sequence and transforms it into a fixed-dimensional representation called the context vector. According to the paper, the encoder is composed of a stack of N = 6 identical layers, each with two sub-layers: self-attention and a feed-forward network.
  2. Decoder — The decoder takes the context representation produced by the encoder and generates the output sequence one element at a time. It is also composed of a stack of N = 6 identical layers, each with three sub-layers: self-attention, encoder-decoder attention, and a feed-forward network.
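As a concrete (if simplified) view of this encoder-decoder stack, PyTorch ships an `nn.Transformer` module whose defaults mirror the paper’s base configuration; the sketch below is a toy, with random tensors standing in for real token embeddings.

```python
import torch
import torch.nn as nn

# A stack of N=6 encoder and N=6 decoder layers, matching the paper's configuration.
# d_model=512 and nhead=8 are the paper's base settings; batch_first is a convenience choice.
model = nn.Transformer(d_model=512, nhead=8,
                       num_encoder_layers=6, num_decoder_layers=6,
                       batch_first=True)

src = torch.randn(1, 10, 512)   # dummy source sequence: batch=1, 10 tokens, 512-dim embeddings
tgt = torch.randn(1, 7, 512)    # dummy target sequence generated so far
out = model(src, tgt)           # encoder consumes src; decoder attends to encoder output and tgt
print(out.shape)                # torch.Size([1, 7, 512])
```

In a real model, `src` and `tgt` would be embedded token sequences, and at inference time the decoder would be run autoregressively, feeding each generated token back in as the next step’s input.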

Popular pre-trained Transformer models include (a short usage sketch follows this list):

  1. Bidirectional Encoder Representations from Transformers (BERT) — BERT uses only the encoder part of the Transformer, reading text bidirectionally; it is suited to language-understanding tasks.
  2. Generative Pre-trained Transformer (GPT) — GPT uses only the decoder part of the Transformer, processing text unidirectionally from left to right; it is used for language-generation tasks.
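A minimal sketch of that difference in practice, again using the Hugging Face `transformers` library (an illustrative choice, not something this article prescribes); the model names are standard public checkpoints.

```python
# Assumes `pip install transformers torch`.
from transformers import AutoTokenizer, AutoModel, pipeline

# Encoder-only (BERT): produces one contextual embedding per token, useful for understanding tasks.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
inputs = tokenizer("The animal didn't cross the street because it was too tired",
                   return_tensors="pt")
embeddings = bert(**inputs).last_hidden_state
print(embeddings.shape)  # (1, number_of_tokens, 768)

# Decoder-only (GPT-2): generates text left to right, one token at a time.
generator = pipeline("text-generation", model="gpt2")
print(generator("Transformers are", max_length=15)[0]["generated_text"])
```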

References —

  1. https://www.youtube.com/watch?v=SMZQrJ_L1vo
  2. https://jalammar.github.io/illustrated-transformer/

Finally

Hopefully, you enjoyed reading it. Buckle up, because our next blog is gonna be EPIC!

Got questions? Don’t be shy! Hit me up on LinkedIn. Coffee’s on me (virtually, of course).
