Course: Generative AI: Working with Large Language Models


Going further with Transformers

- [Jonathan] We've covered a ton of material in this course. We've looked at many of the large language models since GPT-3. Let's review them really quickly. We saw how Google reduced training and inference costs by using a sparse mixture of experts model with GLaM. A month later, Microsoft teamed up with Nvidia to create the Megatron-Turing NLG model that was three times larger than GPT-3 with 530 billion parameters. In the same month, the DeepMind team released Gopher, and their largest 280 billion parameter model was their best performing model. A few months later, the DeepMind team introduced Chinchilla, which turned a lot of our understanding of large language models on its head. The main takeaway was that large language models up to this point had been undertrained. Google released the 540 billion parameter model PaLM in April, training it on their Pathways infrastructure, and this has been the best performing…
