Google's Gemini tells us a lot about the AI race

Azeem Azhar

Making sense of the Exponential Age

Published: 11 December 2023

The release of ChatGPT fired a starting gun no one expected. As the NYT reports, when the Big Tech firms saw the reception of ChatGPT, they immediately pivoted to developing their own AI products, with minimal care for risks. Meta open-sourced Llama-2, Microsoft added GPT to its products and Google rush-released Bard. It seems like a blind sprint in a race no one fully understood.

Last week, the next leg was revealed with the announcement of Google’s Gemini models. They are multi-modal from the ground up, meaning they can understand and reason across modalities such as text, image, audio and video. For an impressive example of this, see here how Gemini reads, understands and filters 200,000 scientific papers over a lunch break. The largest of the models, Gemini Ultra, looks to be the best model on the market, finally beating GPT-4. It has achieved state-of-the-art (SOTA) results on 30 of the 32 most common research benchmarks and is the first model ever to outperform humans on the well-known MMLU benchmark.

But this doesn’t tell the full story. If anything shows how intense the AI race has become, it’s the marketing tricks that Google used to make its model seem better than it is.

Firstly, their MMLU SOTA score lacks nuance. Gemini Ultra beats GPT-4 using Chain-of-Thought@32 prompting, a method that has the AI generate 32 different reasoning paths for a single question, considering various angles and possibilities, before choosing the most consistent or convincing answer. While this can lead to more considered responses, it is a more complex process and less commonly used for quick, everyday queries where users prioritise immediate and concise answers. On the other hand, GPT-4 beats Gemini Ultra using the 5-shot method, which involves presenting the AI with five examples of a task, complete with questions and the correct answers, to help it understand what’s expected before posing a new, similar question. This approach is likely closer to how users might naturally give context to guide the AI towards the kind of response they’re seeking. This highlights some of the limitations of benchmarks, as we have previously covered in Chartpack: (Mis)measuring AI. Because LLMs have such wide-ranging use cases, benchmarks often represent their performance inadequately and abstract away from the actual qualia of using these models. We will have to wait for Gemini Ultra’s release to properly judge it.
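To make the difference between the two evaluation setups concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not Google's or OpenAI's actual evaluation code: `toy_model` is a hypothetical stand-in for a real model call, the "most consistent answer" step is simplified to a plain majority vote, and the `Q:`/`A:` prompt layout is just one plausible way to format five examples.

```python
from collections import Counter
from typing import Callable, List, Tuple


def majority_vote(answers: List[str]) -> str:
    """Return the most frequent final answer among the sampled reasoning paths."""
    return Counter(answers).most_common(1)[0][0]


def cot_at_k(question: str, sample_answer: Callable[[str, int], str], k: int = 32) -> str:
    """Sample k reasoning paths for one question and keep the most consistent answer."""
    answers = [sample_answer(question, i) for i in range(k)]
    return majority_vote(answers)


def five_shot_prompt(examples: List[Tuple[str, str]], question: str) -> str:
    """Build one prompt: five solved question/answer pairs, then the new question."""
    if len(examples) != 5:
        raise ValueError("expected exactly five examples")
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\n\nQ: {question}\nA:"


def toy_model(question: str, seed: int) -> str:
    """Hypothetical model call: answers 'B', except 'C' on every fourth sample."""
    return "C" if seed % 4 == 0 else "B"


if __name__ == "__main__":
    # 32 samples, one aggregated answer: far more compute per question.
    print(cot_at_k("Which option is correct?", toy_model))
    # A single prompt and a single model call: closer to everyday use.
    demo = [(f"example question {i}", f"example answer {i}") for i in range(5)]
    print(five_shot_prompt(demo, "new question"))
```

The point of the contrast is cost and realism: the first setup calls the model 32 times per question and aggregates the results, while the second calls it once with a little context, so scores obtained under the two setups are not directly comparable.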

Their second marketing trick was in their announcement video. They presented a demo that made Gemini look like a miraculous real-time assistant. However, it turns out that the demo was not recorded in real time: the model’s response time was sped up, and there was some complex prompting done on the backend. I can’t help but think this is a sign that Google feels threatened and needs to make things appear better than they are.

This contrasts with OpenAI’s “low key research preview” that was ChatGPT’s release. In fairness, the game has changed since then. One thing is certain: with the competition hotting up, no tech firm can afford to sit still, whether or not we see models much better than Gemini Ultra or GPT-4 in the next few months.

See also my commentary from earlier this year:

Mykyta Basanko

COO at Incode Group // Business Advisor at MLPCo

11 months ago

Great read!

Amy Zeitz Bailey

Realtor in Louisville Kentucky · Finding the Land to Build Dream Homes · Residential, Farms, Land, Acreage

11 months ago

Comparing Horse and first model car analogy is perfect in insufferable resistance to intelligent progress. Suggest going to Henry Ford museum near Detroit the best North American museum regarding innovations. Lots of fabulous cars including 100 year old electric cars, but also brought together a group of buildings like Edison and Wright brothers garages. Ride around in a real Model T and kids can build a model t in a day.

Anuj Rastogi

Vice President, Client Partnerships at Massive Insights, Founder & Host at AwokenWord Podcast

11 months ago

Your opening point Azeem Azhar is super important - “…. when the Big Tech firms saw the reception of ChatGPT, they immediately pivoted to developing their own AI products, with minimal care for risks.” We have to be mindful of the risks of this race, and not just cast them to the side to be dealt with later. That’s a central point Martin Ryan made throughout our conversation on AI. https://www.dhirubhai.net/posts/anuj-rastogi-22a0ab2_samaltman-chatgpt-podcast-activity-7133521290954473473-ObVv?utm_source=share&utm_medium=member_ios

Pedro Rocha

Artificial Intelligence · Futures Studies · Sustainability & Impact · Ecosystem Builder

11 months ago

Perfect! Let's all remember about the promise of world changing Google Duplex, in 2018, that is still not a reality.
