Course: TensorFlow: Working with NLP


Self-attention

- [Instructor] In this example text, "the monkey ate that banana because it was too hungry," how is the model able to determine that "it" corresponds to the monkey and not to the banana? It does this using a mechanism called self-attention, which incorporates the embeddings of all the other words in the sentence. So when processing the word "it," self-attention takes a weighted average of the embeddings of the other context words. The darker the shade, the more weight that word is given, and every word is given some weight. You can see that both "banana" and "monkey" come up as likely matches for the word "it," but "monkey" has the higher weighted average. What's happening under the hood? As part of the self-attention mechanism, the authors of the original transformer take the word embeddings and project them into three vector spaces, which they called query, key, and value. Why project the word embeddings into these three new vector…
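The steps just described can be sketched in a few lines of TensorFlow. This is a minimal, illustrative single-head self-attention example, not the course's own code: the sentence length, embedding size, and projection-matrix names (W_q, W_k, W_v) are assumptions chosen for clarity, and the projections are random rather than learned.

```python
import tensorflow as tf

tf.random.set_seed(0)

seq_len, d_model = 6, 8                     # e.g. a 6-token sentence, 8-dim embeddings
x = tf.random.normal((seq_len, d_model))    # one embedding row per word

# Project the embeddings into the three vector spaces: query, key, value.
W_q = tf.random.normal((d_model, d_model))
W_k = tf.random.normal((d_model, d_model))
W_v = tf.random.normal((d_model, d_model))
Q, K, V = x @ W_q, x @ W_k, x @ W_v

# Scaled dot-product attention: each row of `weights` sums to 1 and says
# how much each context word contributes to that token's representation.
scores = Q @ tf.transpose(K) / tf.sqrt(tf.cast(d_model, tf.float32))
weights = tf.nn.softmax(scores, axis=-1)

# Each word's output is a weighted average of all the value vectors --
# this is the "weighted average of the embeddings" described above.
output = weights @ V
```

When the weights are learned during training, the row for "it" ends up placing most of its mass on "monkey," which is exactly what the shading in the example visualizes.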