Course: Generative AI: Working with Large Language Models
Self-attention
- [Instructor] One of the key ingredients of transformers is self-attention. Take this example text: "The monkey ate that banana because it was too hungry." How is the model able to determine that "it" corresponds to the monkey and not the banana? It does this using a mechanism called self-attention, which incorporates the embeddings of all the other words in the sentence. So when processing the word "it," self-attention takes a weighted average of the embeddings of the other context words. The darker the shade, the more weight that word is given, and every word is given some weight. You can see that both banana and monkey come up as likely candidates for the word "it," but monkey has the higher weighted average. So what's happening under the hood? As part of the self-attention mechanism, the authors of the original transformer take the word embeddings and project them into three vector spaces, which they called query, key, and…
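To make the query, key, and value idea concrete, here is a minimal NumPy sketch of scaled dot-product self-attention over this example sentence. The embeddings, the projection matrices, and the dimensions (d_model, d_k) are random, made-up stand-ins rather than anything from the course; a trained transformer learns these values, so the printed weights are illustrative only. What the sketch does show faithfully is the mechanism described above: each token's query is scored against every key, the scores are turned into weights, and the output is a weighted average of the value vectors.

```python
# Minimal self-attention sketch (toy dimensions and random weights, not trained values).
import numpy as np

rng = np.random.default_rng(0)

tokens = ["The", "monkey", "ate", "that", "banana", "because", "it", "was", "too", "hungry"]
d_model, d_k = 8, 4  # assumed toy sizes for illustration

# Toy word embeddings, one row per token (a trained model would supply these).
X = rng.normal(size=(len(tokens), d_model))

# Projection matrices for query, key, and value (learned in a real model, random here).
W_q = rng.normal(size=(d_model, d_k))
W_k = rng.normal(size=(d_model, d_k))
W_v = rng.normal(size=(d_model, d_k))

Q, K, V = X @ W_q, X @ W_k, X @ W_v

# Scaled dot-product attention: compare each query with every key,
# softmax the scores into weights, then take a weighted average of the values.
scores = Q @ K.T / np.sqrt(d_k)
scores -= scores.max(axis=-1, keepdims=True)          # numerical stability
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
output = weights @ V

# Attention weights for the token "it" -- with trained weights, "monkey" and
# "banana" would receive the largest shares, with "monkey" on top.
it_idx = tokens.index("it")
for tok, w in zip(tokens, weights[it_idx]):
    print(f"{tok:>8}: {w:.3f}")
```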