A tiny new open-source AI model performs as well as powerful big ones
Sarah Rogers/MITTR | Photos Getty

The Allen Institute for Artificial Intelligence (Ai2), a research nonprofit, is releasing a family of open-source multimodal language models called Molmo. The organization says they perform as well as top proprietary models from OpenAI, Google, and Anthropic despite having far fewer parameters. In this edition of What’s Next in Tech, learn more about the tiny AI model that Ai2 says outperforms OpenAI’s GPT-4o.


This new model suggests that training AI on less, but higher-quality, data can lower computing costs.

Ai2 claims that its biggest Molmo model, which has 72 billion parameters, outperforms OpenAI’s GPT-4o, which is estimated to have over a trillion parameters, in tests that measure things like understanding images, charts, and documents.

Meanwhile, the organization says a smaller Molmo model, with 7 billion parameters, comes close to OpenAI’s state-of-the-art model in performance, an achievement it ascribes to vastly more efficient data collection and training methods.

What Molmo shows is that open-source AI development is now on par with closed, proprietary models, says Ali Farhadi, the CEO of Ai2. And open-source models have a significant advantage, as their open nature means other people can build applications on top of them.

Other large multimodal language models are trained on vast data sets containing billions of images and text samples that have been hoovered from the internet, and they can include several trillion parameters. This process introduces a lot of noise to the training data and, with it, hallucinations, says Ani Kembhavi, a senior director of research at Ai2. In contrast, Ai2’s Molmo models have been trained on a significantly smaller and more curated data set containing only 600,000 images, and they have between 1 billion and 72 billion parameters.

Read the story to learn more about Ai2’s process for training Molmo, what its models are capable of, and what others in the industry think about open-source AI.


Get ahead with these related stories:

  1. Why OpenAI’s new model is such a big deal: The bulk of LLM progress until now has been language-driven. This new model enters the realm of complex reasoning, with implications for physics, coding, and more.
  2. Chatbots can persuade people to stop believing in conspiracy theories: AI is skilled at tapping into vast realms of data and tailoring it to a specific purpose, making it a highly customizable tool for combating misinformation.
  3. Why we need an AI safety hotline: Existing measures to mitigate AI risks aren’t enough to protect us. Here’s what we need to do as well.



Tom Jones

Security hardware and software architect

1 month ago

This new fight for top billing in AI sounds like the US News ranking of colleges: no one ever agrees with the results. The big question is the amount of data absorbed. It's like the argument about how much college is sufficient to perform particular jobs, a bar that is now being lowered for many existing jobs. There is no single answer.

Bruce Canedy

Global Healthcare Strategy Director | Software Development, Company Integrations, Flow Blockchain Development

1 month ago

It's impressive that a smaller model, with only 7 billion parameters, can compete with larger LLMs. I appreciate that it's trained on curated data, but I wonder if that could create bias—what safeguards are in place? While fewer parameters might mean fewer errors, does that also limit the model's potential for insights? Energy efficiency is a plus, especially for smaller devices. Would have loved to review the full article.

Petr Adamek

Unlocking Your Business Potential with My Expertise | Digital Transformation | Management Consulting | User Research @AP Consulting

1 month ago

I would maintain healthy skepticism. A self-declared claim of being on par with top-tier models is not enough.
