Course: Introduction to Large Language Models
Comparing LLMs
- [Instructor] How do we even compare large language models? That's a great question. I don't think we have a perfect answer yet, but we have made great progress over the last few months. Usually, we only focus on how good a model is at a task, but we don't know if that same model generates false information. So instead of just looking at one metric, a Stanford University research team proposed HELM, or the Holistic Evaluation of Language Models. With HELM, the Stanford research team worked together with the main large language model providers, and they were able to benchmark the models across a variety of data sets and get a more holistic view of model performance. The HELM benchmark is a living benchmark and should change as new models are released. I'll just cover the first couple of benchmarks and you can explore the rest further if you're interested. So let me just go ahead and scroll down a little bit. So here, each row…
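The core idea behind a holistic comparison is to score each model on several metrics across several scenarios, rather than on a single accuracy number. The sketch below illustrates that idea in miniature; it is not the actual HELM implementation, and the model names, scenarios, and scores are all made up for demonstration.

```python
# Illustrative sketch of holistic evaluation (not the real HELM code):
# average every metric over every scenario into one score per model,
# then rank. All names and numbers here are hypothetical.
from statistics import mean

# scores[model][scenario] -> {metric: value in [0, 1]}
scores = {
    "model-a": {
        "qa":            {"accuracy": 0.82, "calibration": 0.70},
        "summarization": {"accuracy": 0.64, "calibration": 0.58},
    },
    "model-b": {
        "qa":            {"accuracy": 0.78, "calibration": 0.81},
        "summarization": {"accuracy": 0.69, "calibration": 0.77},
    },
}

def holistic_score(model: str) -> float:
    """Average all metrics across all scenarios into one number."""
    values = [
        value
        for scenario in scores[model].values()
        for value in scenario.values()
    ]
    return mean(values)

# Rank models by their aggregate score, best first.
ranking = sorted(scores, key=holistic_score, reverse=True)
```

Here "model-b" ranks first: it trails slightly on accuracy but is much better calibrated, which a single-metric leaderboard would hide.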
Contents
- BERT (3 min 16 s)
- Scaling laws (3 min 30 s)
- GPT-3 (7 min 41 s)
- Chinchilla (7 min 54 s)
- PaLM and PaLM 2 (3 min 59 s)
- ChatGPT and GPT-4 (5 min 47 s)
- Open LLMs (5 min 40 s)
- Comparing LLMs (3 min 35 s)
- GitHub Models: Comparing LLMs (2 min 52 s)
- Accessing large language models using an API (6 min 25 s)
- LLM trends (4 min 6 s)