A tiny new open-source AI model performs as well as powerful big ones
Sarah Rogers/MITTR | Photos Getty

The Allen Institute for Artificial Intelligence (Ai2), a research nonprofit, is releasing a family of open-source multimodal language models called Molmo. The organization says they perform as well as top proprietary models from OpenAI, Google, and Anthropic despite having far fewer parameters. In this edition of What’s Next in Tech, learn more about the tiny AI model that Ai2 says outperforms OpenAI’s GPT-4o.


This new model suggests that training AI on less, but higher-quality, data can lower computing costs.

Ai2 claims that its biggest Molmo model, which has 72 billion parameters, outperforms OpenAI’s GPT-4o, which is estimated to have over a trillion parameters, in tests that measure things like understanding images, charts, and documents.

Meanwhile, the organization says a smaller Molmo model, with 7 billion parameters, comes close to OpenAI’s state-of-the-art model in performance, an achievement it ascribes to vastly more efficient data collection and training methods.

What Molmo shows is that open-source AI development is now on par with closed, proprietary models, says Ali Farhadi, the CEO of Ai2. And open-source models have a significant advantage, as their open nature means other people can build applications on top of them.

Other large multimodal language models are trained on vast data sets containing billions of images and text samples that have been hoovered from the internet, and they can include several trillion parameters. This process introduces a lot of noise to the training data and, with it, hallucinations, says Ani Kembhavi, a senior director of research at Ai2. In contrast, Ai2’s Molmo models have been trained on a significantly smaller and more curated data set containing only 600,000 images, and they have between 1 billion and 72 billion parameters.

Read the story to learn more about Ai2’s process for training Molmo, what its models are capable of, and what others in the industry think about open-source AI.


Get ahead with these related stories:

  1. Why OpenAI’s new model is such a big deal: The bulk of LLM progress until now has been language-driven. This new model enters the realm of complex reasoning, with implications for physics, coding, and more.
  2. Chatbots can persuade people to stop believing in conspiracy theories: AI is skilled at tapping into vast realms of data and tailoring it to a specific purpose, making it a highly customizable tool for combating misinformation.
  3. Why we need an AI safety hotline: Existing measures to mitigate AI risks aren’t enough to protect us. Here’s what we need to do as well.



Tom Jones

Security hardware and software architect

1 month ago

This new fight for top billing in AI sounds like the US News ranking of colleges: no one ever agrees with the results. The big question is the amount of data absorbed. It's like the argument about how much college is sufficient to perform particular jobs, a bar that is now being lowered for many existing jobs. There is no single answer.

Bruce Canedy

Global Healthcare Strategy Director | Software Development, Company Integrations, Flow Blockchain Development

1 month ago

It's impressive that a smaller model, with only 7 billion parameters, can compete with larger LLMs. I appreciate that it's trained on curated data, but I wonder if that could create bias—what safeguards are in place? While fewer parameters might mean fewer errors, does that also limit the model's potential for insights? Energy efficiency is a plus, especially for smaller devices. Would have loved to review the full article.

Petr Adamek

Unlocking Your Business Potential with My Expertise | Digital Transformation | Management Consulting | User Research @AP Consulting

1 month ago

I would maintain healthy skepticism. A self-declared claim of being on par with top-tier models is not enough.
