Why grounding is the foundation of successful enterprise AI
Business leaders are buzzing about generative AI. To help you keep up with this fast-moving, transformative topic, our regular column “The Prompt” brings you observations from the field, where Google Cloud leaders are working closely with customers and partners to define the future of AI. In this edition, Warren Barkley, Vertex AI product leader, looks at how organizations can ground gen AI models in factual data to provide more reliable responses that build trust and confidence among all users, whether customers or employees.
Generative AI is a business game-changer, but with a notable caveat: it needs access to real-world context and information to be truly useful in the enterprise.
Though powerful, gen AI models don’t come primed for your industry or know the inner workings of your business. They are limited to what they learned from their training data, which often lacks the information and domain expertise needed for specific business tasks and use cases. Models also have a knowledge cutoff, meaning they are unaware of new developments and information that emerge after their training. More critically, these gaps in knowledge can lead models to generate irrelevant or factually incorrect responses, or, in rarer cases, hallucinations: answers that are completely made up.
In other words, foundation models are trained to predict the most probable answer based on training data, but that’s not the same thing as citing facts. To unlock the full potential of gen AI, organizations need to ground model responses in what we call “enterprise truth”: fresh, real-time data and enterprise systems. This approach lets models retrieve the context they need from external systems, so they can find the latest, most relevant information instead of relying on their limited and potentially outdated training knowledge.
Over the last year, grounding has come to the forefront in many of our conversations with customers and partners alike, especially as more and more move from experimenting with gen AI to putting AI into production. Increasingly, executives are realizing foundation models are simply a starting point, and they are exploring how to use grounding approaches like retrieval augmented generation (RAG) to add context from reliable information sources, their own first-party data, and the latest information from the web.
In this column, we’ll explore some of the benefits and challenges of RAG for grounding models, and how they relate to the solutions we’re building at Google Cloud.
Bringing facts to abstract knowledge
In general, the quickest and easiest way to provide additional background information to models is through prompting. It’s possible to give models an extensive amount of information, with context windows — the amount of information a model can recall before it starts “forgetting” previous interactions — now reaching up to a staggering two million tokens.
For example, you could put an entire employee handbook into a long context window and create a context cache so that multiple prompts can reuse the same information; the model can then pull relevant details for more accurate output, much like RAG. However, this manual approach doesn’t scale well, especially when querying frequently updated data or a large corpus of enterprise knowledge bases.
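To make the context-caching idea concrete, here is a minimal sketch using the Vertex AI Python SDK’s preview caching interface. The project ID, handbook file, and model version are placeholder assumptions, and the preview API surface may change.

```python
import datetime

import vertexai
from vertexai.preview import caching
from vertexai.preview.generative_models import Content, GenerativeModel, Part

# Placeholder project and location; substitute your own values.
vertexai.init(project="your-project-id", location="us-central1")

# Illustrative only: load the full employee handbook as plain text.
handbook_text = open("employee_handbook.txt", encoding="utf-8").read()

# Cache the handbook once so repeated prompts can reuse it instead of
# resending the full document with every request.
cached_handbook = caching.CachedContent.create(
    model_name="gemini-1.5-flash-001",
    system_instruction="Answer questions using only the employee handbook provided.",
    contents=[Content(role="user", parts=[Part.from_text(handbook_text)])],
    ttl=datetime.timedelta(hours=1),
)

# Later prompts are answered against the cached handbook context.
model = GenerativeModel.from_cached_content(cached_content=cached_handbook)
response = model.generate_content("How many days of parental leave do we offer?")
print(response.text)
```

Because the cache has a fixed time-to-live, this pattern suits a relatively stable document that many prompts reference; for frequently changing data or a large corpus, a retrieval approach is usually the better fit.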
To enable gen AI models to automatically retrieve relevant, factual information, you will need some form of grounding solution, such as RAG.
Suppose you want to create an AI agent to help employees choose the right benefits package to fit their needs. Without grounding, the agent would only be able to discuss in general terms how the most common employee benefits work based on its training data, but it wouldn’t have any awareness of the benefits your organization offers. And even if an agent’s training data included all your employee benefits, individual programs change all the time; without a way to reference newly updated information, a model would likely fall out of date quickly.
In this scenario, RAG could connect your models to plan policies, detailed benefit summaries, carrier contracts, and other relevant documentation, allowing agents to answer specific questions, provide recommendations, or enroll employees directly in online portals — all without the need to retrain or fine-tune models.
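As a rough, framework-agnostic illustration of that retrieve-then-generate pattern, the sketch below embeds a few benefits snippets, finds the ones most similar to an employee’s question, and passes them to the model as context. The snippet text, embedding model, and in-memory similarity search are simplifying assumptions; a production RAG system would typically add document chunking and a managed vector store.

```python
import numpy as np

import vertexai
from vertexai.generative_models import GenerativeModel
from vertexai.language_models import TextEmbeddingModel

# Placeholder project and location; substitute your own values.
vertexai.init(project="your-project-id", location="us-central1")

# Illustrative benefits snippets; in practice these would come from plan
# policies, benefit summaries, and carrier contracts.
docs = [
    "The PPO medical plan covers dependents up to age 26 with a $500 deductible.",
    "Dental coverage includes two cleanings per year at no cost to the employee.",
    "The high-deductible plan is HSA-eligible with a $1,600 individual deductible.",
]

# Embed the documents once so questions can be matched against them.
embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")
doc_vectors = np.array([e.values for e in embedder.get_embeddings(docs)])

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k snippets most similar to the question (cosine similarity)."""
    q = np.array(embedder.get_embeddings([question])[0].values)
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [docs[i] for i in np.argsort(scores)[::-1][:k]]

question = "Which plan should I pick if I want a low deductible for my kids?"
context = "\n".join(retrieve(question))

# Ground the answer in the retrieved benefits text rather than training data alone.
model = GenerativeModel("gemini-1.5-flash")
prompt = (
    "Answer using only the benefits information below.\n\n"
    f"{context}\n\nQuestion: {question}"
)
print(model.generate_content(prompt).text)
```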
These advantages are why RAG is now the primary approach organizations are pursuing to ground their gen AI applications and agents in enterprise truth. With an increasingly robust ecosystem of products and integrations available, we’re seeing more and more possibilities emerge that can help tackle use cases demanding domain-specific knowledge and deep contextual awareness. There are several ways to incorporate RAG into gen AI, ranging from simpler approaches like linking models directly to the internet for recency to more complex, custom-built RAG systems.
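As one example of the simpler end of that spectrum, grounding a request in fresh web results can be as small as attaching a search tool to the model call. The sketch below uses the Google Search grounding tool exposed in the Vertex AI Python SDK; the model version and question are placeholders, and tool class names can vary between SDK releases.

```python
import vertexai
from vertexai.generative_models import GenerativeModel, Tool, grounding

# Placeholder project and location; substitute your own values.
vertexai.init(project="your-project-id", location="us-central1")

# Attach Google Search as a grounding tool so the model can draw on
# current web information instead of relying only on its training data.
search_tool = Tool.from_google_search_retrieval(grounding.GoogleSearchRetrieval())

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "What are this year's IRS contribution limits for health savings accounts?",
    tools=[search_tool],
)
print(response.text)
```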
Continue reading on Transform with Google Cloud.
Catch up on the latest announcements
We recently announced several updates to make Gemini, Meta, and Mistral models more accessible, followed shortly by the announcement of the Jamba 1.5 Model Family from AI21 Labs.
Lower pricing for Gemini 1.5 Flash
More Languages for Gemini
Bringing more models to Vertex AI Model Garden
Read Your ultimate guide to the latest in generative AI on Vertex AI to catch up on more recent announcements.