Course: Introduction to Large Language Models
Chinchilla
- [Instructor] Over the years, the trend has been to increase the model size. Although we won't look at any of these models in detail, I'll mention them briefly now because we'll be comparing them later. So Megatron-Turing, released through a collaboration between Microsoft and Nvidia in January of 2022, had 530 billion parameters. The Google DeepMind team released details about Gopher, which had 280 billion parameters and was one of the best models out there at the time. You can see that the model sizes were getting very large, and this was because of the scaling laws. But what if the scaling laws didn't capture the entire picture? The DeepMind team's hypothesis was that large language models were significantly undertrained: you could get much better performance with the same computational budget by training a smaller model for longer. Now, the way you would test out a hypothesis like this is to do a whole lot of…
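The compute-optimal tradeoff the instructor is describing can be sketched with a little arithmetic. The snippet below is a rough illustration, not anything from the course: it assumes the common approximation that training compute C ≈ 6 · N · D (N parameters, D training tokens) and the roughly 20-tokens-per-parameter ratio suggested by the Chinchilla results; the example budget value is only indicative of the Gopher/Chinchilla scale.

```python
import math

def compute_optimal(compute_budget_flops: float, tokens_per_param: float = 20.0):
    """Estimate a compute-optimal model size and token count for a fixed budget.

    Assumes C ~= 6 * N * D (a standard rough estimate of training FLOPs) and a
    fixed tokens-per-parameter ratio (~20, per the Chinchilla rule of thumb).
    Both are heuristics, so treat the outputs as order-of-magnitude estimates.
    """
    # From C = 6 * N * D and D = tokens_per_param * N:
    #   N = sqrt(C / (6 * tokens_per_param)),  D = tokens_per_param * N
    n_params = math.sqrt(compute_budget_flops / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Illustrative budget on the order of what Gopher/Chinchilla used (~5.8e23 FLOPs).
budget = 5.8e23
n, d = compute_optimal(budget)
print(f"~{n / 1e9:.0f}B parameters trained on ~{d / 1e12:.1f}T tokens")
# Prints roughly 70B parameters and ~1.4T tokens: a much smaller model than
# Gopher's 280B, trained on far more data, for about the same compute.
```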
Contents
- BERT (3 min 16 s)
- Scaling laws (3 min 30 s)
- GPT-3 (7 min 41 s)
- Chinchilla (7 min 54 s)
- PaLM and PaLM 2 (3 min 59 s)
- ChatGPT and GPT-4 (5 min 47 s)
- Open LLMs (5 min 40 s)
- Comparing LLMs (3 min 35 s)
- GitHub Models: Comparing LLMs (2 min 52 s)
- Accessing large language models using an API (6 min 25 s)
- LLM trends (4 min 6 s)