The Gemini family tree
Last month at Google I/O, we introduced Gemini 1.5 Flash, the latest model in the growing Gemini family. We asked Hamidou Dia, vice president for applied engineering at Google Cloud, to explain a bit about all the different models that now belong to the family.
The Gemini family is a big one, and it just keeps growing. And like any family, each member has its own strengths and personality. Gemini 1.5 Flash is the newest of the bunch, and one of our most capable offerings yet. What’s so special about Flash and all its relatives? What makes each of them — Gemini 1.5 Pro, Gemini 1.0 Nano, Gemini 1.0 Pro, and Gemini 1.0 Ultra, as well as their cousin Gemma, the open model — different?
Or, what you’re really wondering: Which of them is right for your business or specific applications?
Rarely are any two AI use cases the same, and those use cases keep growing in number and maturity every day. It takes a wide range of models to satisfy these different needs.
One of the most important considerations across the Gemini family is the context window: how much information, measured in tokens, a model can take in and reason over in a single request.
Want to try Gemini 1.5 Pro for yourself? Check it out now in the Google Cloud console.
A token is fundamentally the smallest segment that a piece of data can be broken down into for use in a particular model. This could be thought of as a letter or character, but depending on the configuration of both the model and the data, these tokens could be as large as a word or phrase. The larger the context window, the more a model can process and compare information without “forgetting” what has already been processed or prompted.
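To get a feel for how text maps to tokens, you can ask the model to count them before sending a full request. The snippet below is a minimal sketch using the Vertex AI Python SDK; the project ID, region, and exact model version are placeholder assumptions and may differ in your environment.

```python
# Minimal sketch: counting tokens for a prompt with the Vertex AI Python SDK.
# The project ID, region, and model version are placeholders, not recommendations.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-project-id", location="us-central1")
model = GenerativeModel("gemini-1.5-flash")

prompt = "Summarize the key risks discussed in our Q2 earnings call transcript."
response = model.count_tokens(prompt)

# The same text can tokenize differently across models and tokenizer versions.
print(f"Total tokens: {response.total_tokens}")
print(f"Billable characters: {response.total_billable_characters}")
```

Token counts matter because both pricing and the context window are measured in tokens, so a quick count tells you whether a document will fit in a single request.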
If your context window only covers a few thousand tokens, maybe the model could understand a single whitepaper or a few emails. When it gets into the millions, that’s enough capacity to understand and analyze entire books or movies or, more practically for the enterprise, entire codebases, large financial datasets and research reports, or hours of footage from a manufacturing floor and a shelf’s worth of production manuals.
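As a rough back-of-envelope check, assuming the common rule of thumb that one token is about three quarters of an English word (the exact ratio varies by tokenizer and language):

```python
# Back-of-envelope sizing: how much text fits in a context window.
# The ratios are rough rules of thumb, not exact figures for any tokenizer.
WORDS_PER_TOKEN = 0.75      # ~3/4 of an English word per token
WORDS_PER_BOOK = 90_000     # roughly a 300-page book

def books_per_window(context_tokens: int) -> float:
    """Approximate number of 300-page books that fit in a context window."""
    return (context_tokens * WORDS_PER_TOKEN) / WORDS_PER_BOOK

for window in (8_000, 1_000_000, 2_000_000):
    print(f"{window:>9,} tokens ≈ {books_per_window(window):.1f} books")
```

On those rough numbers, a few thousand tokens holds a long memo, while a million-token window holds several novel-length books, which is why whole codebases and hours of transcripts become practical inputs.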
That’s where things really get interesting, when you start to combine some of these materials. The other important aspect of Gemini is that all the models are natively multimodal. Previous generations of models could maybe identify an image or video while also deciphering text or code, but that was basically shuttling the information between a set of sub-models. Gemini was developed from the start to handle a range of information types, just as a person normally would.
This means less latency and energy usage and better results for queries involving multiple sources and types of information. A manufacturing company, for example, could upload those manuals and potentially use them to spot dangers or inefficiencies in the factory footage by seamlessly cross-referencing the two. Or an investment firm could upload an investor call, regulatory filings, and references to social media and combine them for investment insights.
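As a sketch of how such a combined request might look in code, here is one way to pass a video clip and a PDF manual to the model in a single call using the Vertex AI Python SDK; the Cloud Storage paths, prompt, and model name are placeholder assumptions.

```python
# Minimal sketch: one multimodal request mixing video footage, a PDF manual, and text.
# The Cloud Storage URIs, prompt, and model name are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="your-project-id", location="us-central1")
model = GenerativeModel("gemini-1.5-pro")

factory_footage = Part.from_uri(
    "gs://your-bucket/factory-floor-cam3.mp4", mime_type="video/mp4"
)
safety_manual = Part.from_uri(
    "gs://your-bucket/line-3-safety-manual.pdf", mime_type="application/pdf"
)

response = model.generate_content([
    factory_footage,
    safety_manual,
    "Cross-reference the footage against the safety manual and list any "
    "procedures that appear to be violated, citing the relevant section.",
])
print(response.text)
```

Because the model ingests the video and the document in the same request, the cross-referencing happens in one pass rather than being stitched together from separate vision and text pipelines.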
This is where the family of models becomes so important. For the most lightweight applications, Gemini 1.0 Nano runs directly on devices; Gemini 1.5 Flash is optimized for speed and cost efficiency at high volume; Gemini 1.0 Pro and Gemini 1.5 Pro cover general-purpose work, with 1.5 Pro handling the longest contexts and the most complex multimodal tasks; Gemini 1.0 Ultra is built for the most demanding reasoning; and Gemma is the open model that teams can adapt and run themselves. The right fit comes down to the latency, cost, and context requirements of your workload.
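One way to make that choice concrete is a simple rule of thumb. The helper below is purely illustrative: the selection criteria and the model identifiers are assumptions for the sketch, not an official sizing guide.

```python
# Illustrative sketch only: mapping rough workload needs to a Gemini model.
# The selection rules and model names are assumptions, not official guidance.
def pick_gemini_model(on_device: bool, needs_long_context: bool,
                      latency_sensitive: bool) -> str:
    if on_device:
        return "gemini-nano"       # runs locally, no round trip to the cloud
    if needs_long_context:
        return "gemini-1.5-pro"    # longest context window in the family
    if latency_sensitive:
        return "gemini-1.5-flash"  # tuned for speed and cost at volume
    return "gemini-1.0-pro"        # general-purpose default

print(pick_gemini_model(on_device=False, needs_long_context=True,
                        latency_sensitive=False))
```

In practice the decision also weighs price, quotas, and where your data lives, so treat a sketch like this as a starting point for evaluation rather than a rule.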
It’s a big family, ready to get to work.
Speaking of the capabilities of our models, underlying infrastructure, and enterprise tooling in Vertex AI Platform, we’re excited to share that Google was named a Leader in The Forrester Wave™: AI Foundation Models for Language, Q2 2024. Google received the highest scores of all vendors evaluated in the Current Offering and Strategy categories, with Forrester noting:
“Gemini is uniquely differentiated in the market especially in multimodality and context length while also ensuring interconnectivity with the broader ecosystem of complementary cloud services.”
You can read more in our blog or download a complimentary copy of the full report.