Google takes its best shot against OpenAI with new Gemini AI models
Welcome to AI Decoded, Fast Company’s weekly LinkedIn newsletter that breaks down the most important news in the world of AI. I’m Mark Sullivan, a senior writer at Fast Company, covering emerging tech, AI, and tech policy.
This week, I’m focusing on Google’s new Gemini large language model, the announcement of which is probably intended to convince Wall Street that Google still has its AI mojo. I’m also looking at an important new alliance of tech companies who are crusading against “closed” AI models. Also, some thoughts from Yann LeCun about the current AI hype.
If a friend or colleague shared this newsletter with you, you can sign up to receive it every week here. And if you have comments on this issue and/or ideas for future ones, drop me a line at [email protected], and follow me on X (formerly Twitter) @thesullivan.
Google announces Gemini model, which can see and hear
Google was caught flat-footed when OpenAI suddenly released ChatGPT to the public a year ago, and the search giant has been furiously playing catch-up ever since. On Wednesday, Google announced its powerful new Gemini large language models (LLMs), which it says are the first built to process not just words but also sounds and images. Gemini was developed in part by the formidable brains at Google DeepMind, with involvement across the organization. Based on what I saw in a press briefing earlier this week, Gemini could put Google back at the front of the current AI arms race.
Google is releasing a family of Gemini models: a large Ultra model for complex AI tasks, a midsize Pro model for more general work, and a smaller Nano model that’s designed to run on mobile phones and the like. (In fact, Google plans to build Gemini into the Android OS for one of its phones next year.) The Ultra model “exceeds current state-of-the-art results” on 30 of the 32 benchmarks commonly used to test LLMs, Google says. It also scored 90% on a harder test called Massive Multitask Language Understanding (MMLU), which assesses a model’s comprehension across 57 subject areas, including math, physics, history, and medicine. Google says it’s the first LLM to outperform human experts on the test.
The models were pretrained (allowed to process large amounts of training data on their own) using images, audio, and code. A Google spokesperson tells me the new models were pretrained using “data” from YouTube but didn’t say if they were pretrained by actually “watching” videos, which would be a major breakthrough. (OpenAI’s GPT-4 model is multimodal and can accept image and voice prompts.)
Models that can see and hear are a big functional step forward. When running on an Android phone, Gemini (the Nano version) could use the device’s camera and microphones to process images and sounds from the real world. Or, if Nano performs anything like the larger models, it might be used to identify and reason about real-world objects it “sees” through the lenses of a future augmented reality headset (developed by Google or one of its hardware partners). That’s something Apple’s iPhone and Vision Pro VR headset probably won’t be able to deliver next year, though Meta is hard at work on XR headsets that perform this sort of visual computing.
During the press briefing Tuesday, Google screened a video showing Gemini reasoning over a set of images. In the video, a person placed an orange and a fidget toy on the table in front of a lens connected to Gemini. Gemini immediately identified both objects and responded with a clever commonality between the two items: “Citrus can be calming and so can the spin of a fidget toy,” it said aloud. In another video, Gemini is shown a math test on which a student has handwritten their solution to a problem; Gemini then identifies and explains the errors in the student’s calculations.
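To put that demo in developer terms: if Google eventually exposes this vision capability through an API, replicating it might look something like the minimal Python sketch below. To be clear, the package name, the model identifier, and the mixed image-and-text prompt format here are my assumptions based on Google’s existing generative AI tooling, not confirmed details from the briefing.

```python
# Hypothetical sketch of a vision prompt in the spirit of the
# orange-and-fidget-toy demo. Package name, model identifier, and
# prompt format are assumptions, not confirmed by Google.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # hypothetical developer key

model = genai.GenerativeModel("gemini-pro-vision")  # assumed model name
photo = Image.open("table.jpg")  # a photo of the two objects

response = model.generate_content(
    [photo, "Identify the objects in this photo and name something they have in common."]
)
print(response.text)
```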
In the near term, Gemini’s powers can be experienced through Google’s Bard chatbot. Google says Bard will now be powered by the mid-tier Gemini Pro model, which it expects will give the chatbot better learning and reasoning skills. Bard will upgrade to the more powerful Gemini Ultra model next year, says Sissie Hsiao, Google’s VP/GM of Assistant and Bard. Developers and enterprise customers will be able to access and build on Gemini Pro via an API served from Google Cloud starting December 13, a spokesperson said.
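Google hasn’t shared the developer experience in detail, but based on its existing generative AI SDKs, a first call to Gemini Pro might look roughly like this Python sketch. Again, the package, model name, and method names are educated guesses on my part; the actual API may differ when it launches.

```python
# A minimal sketch of calling Gemini Pro via a Python SDK.
# Package, model name, and method names are assumptions; the
# real API may differ when it becomes available December 13.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # hypothetical developer key

model = genai.GenerativeModel("gemini-pro")  # assumed model name
response = model.generate_content(
    "Summarize this week's AI news in two sentences."
)
print(response.text)
```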
IBM and Meta join 50 other organizations in AI Alliance
IBM, Meta, and about 50 other tech companies and universities have set up a new organization to promote open-source AI. The organization, called the AI Alliance, aims to build and support open technology for AI, as well as the communities that will enable AI to benefit business and society.
As the tech industry, regulators, researchers, and others struggle to understand the real risks of advanced AI models, a debate rages about whether it’s better to develop AI models out in the open, so that anyone can understand and build on them, or to keep them secret in order to prevent bad actors from using the tech for nefarious purposes (say, flooding the web with misinformation). While tech industry heavyweight OpenAI has become more secretive about its research as competition among AI developers has increased, other players like Meta and Hugging Face have been very vocal about the advantages of open source.
The AI Alliance’s formation comes as lawmakers in Europe and the U.S. work toward passing safety and responsibility regulations for AI developers. Many in the AI community hope that the eventual regulations won’t stifle fast-moving development. “A key part of this objective will be to advocate for smart policymaking that is focused on regulating specific applications of AI, not underlying algorithms,” an IBM spokesperson tells me.
But the term “open source” means different things to different people. Meta often talks about open-sourcing its models (like Llama), but even it has been criticized for not opening certain details of its models to the developer community. Indeed, one of the Alliance’s main goals will be to define open source. “For example, creating a framework for releasing models with varying levels of openness and defining what those levels are and the standard requirements—these types of nuances are what AI Alliance members intend to tackle,” the IBM spokesperson said.
Yann LeCun: Today’s LLMs are dumber than your cat
Meta’s chief AI scientist, Yann LeCun, said last week at a Meta FAIR press event that LLMs have a long way to go to reach artificial general intelligence (meaning AI that can match or exceed human performance across a wide array of tasks). “Those systems are stupider than your house cat,” LeCun told the assembled crowd.
LeCun, who helped invent the neural networks underpinning today’s LLMs, says today’s best AI systems still lack the basic understanding of the world needed to carry out useful tasks. They can’t yet execute higher-order capabilities like perceiving, remembering, reasoning, and planning. There isn’t yet a robot, he said, that can be trained to figure out how to clear the table after dinner and put the dishes in the dishwasher—a task a seven-year-old could perform. “AGI is not just around the corner,” LeCun added. “It’s going to take a lot of work.”
The difficulty may lie in how LLMs are trained. Most LLMs learn by processing text from the internet. They can learn a certain amount about the world that way, LeCun says, about as much as a baby knows. But babies also learn through sight and touch, a much richer flow of training data, if you will. Training transformer models on motion video would be a big improvement, but researchers haven’t yet figured out how to do that, LeCun explained. A more robust way of training models might be the path to creating systems with enough common sense (including an understanding of the laws of physics) to handle more complex tasks on behalf of humans.
More AI coverage from Fast Company:
From around the web: