Three Things To Expect From This Week’s Llama 3 405B Release


It’s a big week for AI as Meta releases the latest version of Llama 3. With 405 billion parameters, it is the most powerful open-source model to date, able to answer questions and generate text. It is also expected to achieve near parity with GPT-4, the model underpinning OpenAI’s ChatGPT.

First, it's a sea change for companies that want to build their custom AI models.

Until this release, if you wanted to create a small, specialized model, you had to rely on a larger, closed-source model, with all its restrictions, as the source of training data. Now, companies have unrestricted access to a massive pre-trained model they can use for training. This shortens the time it takes to build a specialized model, opening the door to an entirely new class of rich, robust custom models for an endless list of commercial applications. Any company can now build its own specialized version of ChatGPT, fine-tuned on its existing business data, more easily than before.
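Fine-tuning on business data typically starts by converting existing records into instruction-style training examples. The sketch below is a minimal illustration of that preparation step; the record fields, prompt template, and file name are hypothetical, not a prescribed format:

```python
import json

def to_instruction_example(record):
    """Convert one hypothetical support-ticket record into a
    prompt/completion pair for instruction fine-tuning."""
    return {
        "prompt": f"Customer question: {record['question']}\nAnswer:",
        "completion": " " + record["answer"],
    }

# Hypothetical business data; in practice this comes from your own systems.
tickets = [
    {"question": "How do I reset my password?",
     "answer": "Use the 'Forgot password' link on the sign-in page."},
]

dataset = [to_instruction_example(t) for t in tickets]

# Write JSON Lines, a common input format for fine-tuning pipelines.
with open("train.jsonl", "w") as f:
    for example in dataset:
        f.write(json.dumps(example) + "\n")
```

From here, the JSONL file would feed whatever fine-tuning framework the team already uses.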

Second, it may drive a change in AI architecture.

While an AI model with more than 400 billion parameters might seem sufficient for almost any situation, this version of Llama 3 will often be used in combination with small- and medium-sized models in a multilayered approach. The smaller fine-tuned models will call on the larger one to check their work and, when necessary, correct any errors. We deliver this approach as part of a Composition of Experts, and we think it’s the most efficient way to run an AI system because it takes advantage of the best of both worlds: the breadth that comes from the large model plus the specialized depth of the smaller ones.
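The layered idea can be sketched as a simple routing loop: a small specialized model answers first, and the large generalist is consulted only when the small model's confidence is low. This is an illustrative sketch only, with stand-in functions and a made-up confidence threshold, not SambaNova's actual implementation:

```python
CONFIDENCE_THRESHOLD = 0.8  # illustrative cutoff, tuned per deployment

def small_expert(prompt):
    """Stand-in for a small fine-tuned model: returns (answer, confidence)."""
    if "refund policy" in prompt:
        return "Refunds are issued within 14 days.", 0.95
    return "I'm not sure.", 0.2

def large_model(prompt):
    """Stand-in for the 405B generalist model: slower, but broader coverage."""
    return f"[Large-model answer to: {prompt}]"

def answer(prompt):
    """Route to the small expert first; escalate when confidence is low."""
    reply, confidence = small_expert(prompt)
    if confidence >= CONFIDENCE_THRESHOLD:
        return reply
    return large_model(prompt)

print(answer("What is your refund policy?"))  # handled by the small expert
print(answer("Explain quantum tunneling."))   # escalated to the large model
```

Because most requests terminate at the small model, the expensive 405B model is invoked only for the hard cases.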

Third, it will be a challenge to run.

Cloud service providers offering the model will face a big challenge in deploying it for their customers efficiently and cost-effectively. Customers will want to shop around for platforms that can run the model efficiently without sacrificing accuracy.

One big challenge will come from inferencing. Llama 3 405B is so large that traditional hardware platforms based on GPUs and other silicon struggle to run inference on it. As enterprises pivot from training and fine-tuning their custom models to running production workloads like inferencing, they will look for alternatives to the GPU. They’ll find that SambaNova’s Reconfigurable Dataflow Unit (RDU) chip, which lies at the heart of our SambaNova Suite generative AI platform, is uniquely suited for production workloads. Then they’ll come talk to us.
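The scale of the inference problem is easy to see with back-of-the-envelope arithmetic: merely holding 405 billion parameters in 16-bit precision takes roughly 810 GB, far beyond a single high-end GPU, before counting the KV cache and activations. A quick sketch (the 80 GB figure assumes an 80 GB-class accelerator):

```python
params = 405e9            # parameters in Llama 3 405B
bytes_per_param = 2       # 16-bit (FP16/BF16) weights
gpu_memory_gb = 80        # a single high-end, 80 GB-class GPU

weights_gb = params * bytes_per_param / 1e9
gpus_needed = -(-weights_gb // gpu_memory_gb)  # ceiling division

print(f"Weights alone: {weights_gb:.0f} GB")            # 810 GB
print(f"GPUs just to hold the weights: {gpus_needed:.0f}")  # 11
```

And that is only storage: serving the model also requires sharding it across devices and moving activations between them on every token, which is where throughput and cost diverge sharply between platforms.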

Congratulations to the SambaNova Team!

Cristian Duque

Software Engineer & Developer | AI Engineering - Evaluation | AI Management

3 months ago

Meta's open-source model is definitely leading the race in AI competition!

Neil Johnson

Applying engineering and technology to solve interesting problems

7 months ago

Wow incredible! But how far away are we from 1T parameter models? Is it exponential growth? Or are there key inflection points along the way?

