A tiny new open-source AI model performs as well as powerful big ones
MIT Technology Review
Our in-depth reporting on innovation reveals and explains what’s happening now to help you know what’s coming next.
The Allen Institute for Artificial Intelligence (Ai2), a research nonprofit, is releasing a family of open-source multimodal language models called Molmo. The organization says they perform as well as top proprietary models from OpenAI, Google, and Anthropic despite having far fewer parameters. In this edition of What’s Next in Tech, learn more about the tiny AI model that Ai2 says outperforms OpenAI’s GPT-4o.
This new model suggests that training AI on less, but higher-quality, data can lower computing costs.
Ai2 claims that its biggest Molmo model, which has 72 billion parameters, outperforms OpenAI’s GPT-4o, which is estimated to have over a trillion parameters, in tests that measure things like understanding images, charts, and documents.
Meanwhile, the organization says a smaller Molmo model, with 7 billion parameters, comes close to OpenAI’s state-of-the-art model in performance, an achievement it ascribes to vastly more efficient data collection and training methods.
What Molmo shows is that open-source AI development is now on par with closed, proprietary models, says Ali Farhadi, the CEO of Ai2. And open-source models have a significant advantage, as their open nature means other people can build applications on top of them.
Other large multimodal language models are trained on vast data sets containing billions of images and text samples that have been hoovered from the internet, and they can include several trillion parameters. This process introduces a lot of noise to the training data and, with it, hallucinations, says Ani Kembhavi, a senior director of research at Ai2. In contrast, Ai2’s Molmo models have been trained on a significantly smaller and more curated data set containing only 600,000 images, and they have between 1 billion and 72 billion parameters.
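Because the Molmo weights are openly released, anyone can download a checkpoint and build on it. Below is a minimal sketch of querying the 7-billion-parameter model through Hugging Face Transformers. The repository ID (allenai/Molmo-7B-D-0924), the example image URL, and the Molmo-specific helper calls (processor.process, model.generate_from_batch, which come from the model’s remote code) reflect Ai2’s published usage pattern as best I recall and should be treated as assumptions rather than a verified API.

```python
# Sketch: asking an open-weights Molmo checkpoint to describe an image.
# Repo ID and the remote-code helper methods are assumptions based on
# Ai2's model card; check the card for the exact, current interface.
import requests
from PIL import Image
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig

repo = "allenai/Molmo-7B-D-0924"  # assumed Hugging Face repo ID for the 7B model

processor = AutoProcessor.from_pretrained(
    repo, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)
model = AutoModelForCausalLM.from_pretrained(
    repo, trust_remote_code=True, torch_dtype="auto", device_map="auto"
)

# Pair an image with a text prompt for the vision-language model.
image = Image.open(requests.get("https://picsum.photos/536/354", stream=True).raw)
inputs = processor.process(images=[image], text="Describe this image.")
inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

# Generate a caption; generate_from_batch is Molmo's remote-code helper.
output = model.generate_from_batch(
    inputs,
    GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
    tokenizer=processor.tokenizer,
)
generated = output[0, inputs["input_ids"].size(1):]
print(processor.tokenizer.decode(generated, skip_special_tokens=True))
```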
Read the story to learn more about Ai2’s process for training Molmo, what the models are capable of, and what others in the industry think about open-source AI.
Image: Sarah Rogers/MITTR | Photos Getty
Security hardware and software architect
1 mo · This new fight for top billing in AI sounds like the US News ranking of colleges: no one ever agrees with the results. The big question is the amount of data absorbed. It's like the argument about how much college is sufficient to perform a particular job; that bar is now being pushed down for many existing jobs. There is no single answer.
Global Healthcare Strategy Director | Software Development, Company Integrations, Flow Blockchain Development
1 mo · It's impressive that a smaller model, with only 7 billion parameters, can compete with larger LLMs. I appreciate that it's trained on curated data, but I wonder if that could create bias; what safeguards are in place? While fewer parameters might mean fewer errors, does that also limit the model's potential for insights? Energy efficiency is a plus, especially for smaller devices. Would have loved to review the full article.
Coordinator for Digital Markets - SDIC/MDIC
1 mo · Alause Pires
Unlocking Your Business Potential with My Expertise | Digital Transformation | Management Consulting | User Research @AP Consulting
1 mo · I would maintain healthy scepticism. A self-declared claim of being on par with top-tier models is not enough.