Why grounding is the foundation of successful enterprise AI

Business leaders are buzzing about generative AI. To help you keep up with this fast-moving, transformative topic, our regular column "The Prompt" brings you observations from the field, where Google Cloud leaders are working closely with customers and partners to define the future of AI. In this edition, Warren Barkley, Vertex AI product leader, looks at how organizations can ground gen AI models in factual data to provide more reliable responses that build trust and confidence among all users, whether customers or employees.


Generative AI is a business game-changer, but with a notable caveat: it needs access to real-world context and information to be truly useful in the enterprise.

Though powerful, gen AI models don't come primed for your industry or know the inner workings of your business. They are limited to what they learned from their training data, which often lacks the information and domain expertise needed for specific business tasks and use cases. Models also have a knowledge cutoff, meaning they are unaware of developments that occur after their training. More critically, these gaps in knowledge can lead models to generate irrelevant or factually incorrect responses, or, in rarer cases, hallucinations: completely made-up answers.

In other words, foundation models are trained to predict the most probable answer based on training data, but that's not the same thing as citing facts. To unlock the full potential of gen AI, organizations need to ground model responses in what we call "enterprise truth": fresh, real-time data and enterprise systems. This approach lets models retrieve the context they need from external systems, so they can find the latest, most relevant information instead of relying on their limited and potentially outdated training knowledge.

Over the last year, grounding has come to the forefront in many of our conversations with customers and partners alike, especially as more of them move from experimenting with gen AI to putting AI into production. Increasingly, executives are realizing that foundation models are simply a starting point, and they are exploring how to use grounding approaches like retrieval augmented generation (RAG) to add context from reliable information sources, their own first-party data, and the latest information from the web.

In this column, we’ll explore some of the benefits and challenges of RAG for grounding models, and how they relate to the solutions we’re building at Google Cloud.

Bringing facts to abstract knowledge

In general, the quickest and easiest way to provide additional background information to models is through prompting. It's possible to give models an extensive amount of information, with context windows (the amount of information a model can consider at once before it starts "forgetting" earlier context) now reaching up to a staggering two million tokens.

For example, you could place an entire employee handbook into a long context window and create a context cache to serve multiple prompts against the same information, pulling relevant details into more accurate model output, much as RAG does. However, manual efforts don't scale well, especially when querying frequently updated data or a large corpus of enterprise knowledge bases.
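As a rough illustration, this kind of prompt stuffing can be as simple as concatenating a reference document and a user question into one prompt. The function and strings below are hypothetical; a real system would pass the resulting prompt to a model API:

```python
# Toy sketch of "prompt stuffing": placing a reference document directly
# in the prompt so the model can answer from it rather than from its
# training data. All names and text here are illustrative.

def build_prompt(document: str, question: str) -> str:
    """Combine a reference document and a user question into one prompt."""
    return (
        "Answer the question using only the document below.\n\n"
        f"--- DOCUMENT ---\n{document}\n--- END DOCUMENT ---\n\n"
        f"Question: {question}"
    )

# A one-line stand-in for an entire employee handbook.
handbook = "Employees accrue 1.5 vacation days per month of service."
prompt = build_prompt(handbook, "How many vacation days do I accrue monthly?")
print(prompt)
```

With a long-context model, the `document` argument could be the full handbook; a context cache would then let many prompts reuse that same large input without resending it each time.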

To automatically enable gen AI models to retrieve relevant, factual information, you will need some form of grounding solution, such as RAG.

Suppose you want to create an AI agent to help employees choose the right benefits package for their needs. Without grounding, the agent could only discuss in general terms how the most common employee benefits work based on its training data; it wouldn't have any awareness of the benefits your organization actually offers. And even if the agent's training data included all your employee benefits, individual programs change all the time; without a way to reference newly updated information, the model would quickly fall out of date.

In this scenario, RAG could connect your models to plan policies, detailed benefit summaries, carrier contracts, and other relevant documentation, allowing agents to answer specific questions, provide recommendations, or enroll employees directly in online portals — all without the need to retrain or fine-tune models.
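To make the idea concrete, here is a minimal, hypothetical RAG sketch for the benefits scenario. A toy keyword-overlap retriever stands in for a real vector database, and the retrieved text is folded into a grounded prompt; every name and document below is invented for illustration:

```python
# Minimal RAG sketch: retrieve the most relevant document for a query,
# then ground the model's prompt in that retrieved context. A production
# system would use embeddings and a vector store instead of word overlap.

def retrieve(query: str, docs: dict[str, str], k: int = 1) -> list[str]:
    """Rank documents by word overlap with the query; return the top-k texts."""
    q_words = set(query.lower().split())
    scored = sorted(
        docs.items(),
        key=lambda kv: len(q_words & set(kv[1].lower().split())),
        reverse=True,
    )
    return [text for _, text in scored[:k]]

def grounded_prompt(query: str, docs: dict[str, str]) -> str:
    """Build a prompt that grounds the model in retrieved context."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer from the context."

# Stand-ins for plan policies and benefit summaries.
benefits_docs = {
    "dental": "The dental plan covers two cleanings per year at no cost.",
    "vision": "The vision plan reimburses up to $200 for frames annually.",
}
print(grounded_prompt("How much does the vision plan reimburse for frames?",
                      benefits_docs))
```

Because the documents are fetched at query time, updating a benefits policy means updating the document store, not retraining or fine-tuning the model, which is the core advantage the paragraph above describes.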

These advantages are why RAG is now the primary approach pursued by organizations seeking to ground their gen AI applications and agents in enterprise truth. With an increasingly robust ecosystem of products and integrations available, we're seeing more and more possibilities emerge that can help tackle use cases demanding domain-specific knowledge and deep contextual awareness. There are several different ways to incorporate RAG into gen AI, ranging from simpler approaches like linking models directly to the internet for recency to more complex custom-built RAG systems.

Continue reading on Transform with Google Cloud.


Catch up on the latest announcements

We recently announced several updates to make Gemini, Meta, and Mistral models more accessible, followed shortly by announcing the Jamba 1.5 Model Family from AI21 Labs.

Lower pricing for Gemini 1.5 Flash

  • What it is: We've updated Gemini 1.5 Flash pricing, reducing input costs by up to ~85% and output costs by up to ~80%, starting August 12, 2024.
  • Why it matters: This is a big price drop on Gemini Flash, a world-class model with a 1-million-token context window and multimodal inputs. Coupled with capabilities like context caching, you can significantly reduce the cost and latency of your long-context queries. Using the Batch API instead of standard requests can further optimize costs for latency-insensitive tasks.
  • Get started: View pricing to learn more and try out Gemini 1.5 Flash today.

More Languages for Gemini

  • What it is: We're enabling Gemini 1.5 Flash and Gemini 1.5 Pro to understand and respond in 100+ languages.
  • Why it matters: We’re making it easier for our global community to prompt and receive responses in their native languages.
  • Get started: View documentation to learn more.

Bringing more models to Vertex AI Model Garden

  • What it is: Llama 3.1 models, Mistral AI’s latest models, and the Jamba 1.5 Model Family from AI21 Labs are now available on Model Garden.
  • Why it matters: Vertex AI Model Garden offers a collection of 150 models to provide choice and flexibility for your needs and budget. Plus, you can easily access these models in just a few clicks using Model-as-a-Service, without any setup or infrastructure hassles.
  • Get started: Visit Model Garden.

Read Your ultimate guide to the latest in generative AI on Vertex AI to catch up on more recent announcements.

