Large language models, or LLMs, can't be relied on to recall specific data they were trained on, so the way to make them work in the enterprise, where accuracy is paramount, is to feed them the right data at prompt time. This is called grounding prompts using retrieval-augmented generation, or RAG for short.
- RAG relies on both keyword search (for structured data) and vector search (for unstructured data such as documents, call transcripts, videos, spreadsheets, etc.).
- Unstructured data is also sometimes referred to as blobs, or Binary Large Objects (data in binary form that may or may not conform to a specific file format). An estimated 80% of enterprise data is unstructured.
- Keyword search and vector search together are referred to as hybrid search; a toy scoring sketch follows this list. Hybrid search makes AI systems like Einstein Copilot very powerful in their ability to understand, generate outputs, and automate actions across a wide variety of use cases, contexts, and data/content types.
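To make hybrid search concrete, here's a minimal sketch in Python. Everything in it is an illustrative assumption, not how Einstein Copilot or any particular engine works: the toy `keyword_score` stands in for a real ranking function like BM25, the cosine similarity stands in for an approximate-nearest-neighbor vector index, and the `alpha` blending weight is just one common way to combine the two.

```python
import math

def keyword_score(query: str, doc: str) -> float:
    """Toy lexical score: fraction of query terms that appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0

def hybrid_score(query: str, doc: str,
                 query_vec: list[float], doc_vec: list[float],
                 alpha: float = 0.5) -> float:
    """Blend lexical and semantic relevance; alpha weights the vector side."""
    return (alpha * cosine_similarity(query_vec, doc_vec)
            + (1 - alpha) * keyword_score(query, doc))
```

Tuning `alpha` lets the same pipeline lean lexical for structured, keyword-heavy queries and lean semantic for fuzzier natural-language ones.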
Why vectors? Unstructured data can't be stored in rows and columns in a relational database. It requires a different approach from SQL (or the Salesforce equivalents, SOQL and SOSL).
- Vectors are an efficient way of representing unstructured data. This matters both for quickly indexing and performing similarity search (also known as semantic search) against prompts, and for efficiently passing large amounts of data into LLMs given their limited context windows (see the sketch after this list).
- Unstructured data requires much more storage and has traditionally been difficult and slow to analyze or search. Enter LLMs, which are very good at identifying the most important, defining attributes of a data blob to pay attention to; those attributes become the vector dimensions, and all others are collapsed or ignored.
- A smaller model dedicated to vectorizing unstructured data, called an embeddings model, is used to create the vectors. The embeddings model is different from the LLM that's used to generate outputs; the generation model receives the retrieved content that the vectors helped locate.
- Vector embeddings aren't new; Google Search has used embeddings for years. But LLMs have made vector embeddings both far more capable and mission-critical for AI applications.
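As a rough illustration of the embeddings step and semantic search, here is a self-contained sketch. The `toy_embed` function below is a stand-in I'm assuming purely so the example runs: it hashes word bigrams into a fixed-size unit vector, whereas a real system would call a dedicated embeddings model, and the linear scan would be replaced by an approximate-nearest-neighbor index.

```python
import hashlib
import math

def toy_embed(text: str, dims: int = 64) -> list[float]:
    """Stand-in for an embeddings model: hash word bigrams into a unit vector.
    A real system would call a dedicated embeddings model here instead."""
    vec = [0.0] * dims
    words = text.lower().split()
    for bigram in zip(words, words[1:]):
        h = int(hashlib.md5(" ".join(bigram).encode()).hexdigest(), 16)
        vec[h % dims] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def semantic_search(query: str, docs: list[str],
                    top_k: int = 3) -> list[tuple[float, str]]:
    """Rank documents by similarity to the query embedding (linear scan)."""
    qv = toy_embed(query)
    # Vectors are unit-length, so the dot product equals cosine similarity.
    scored = [(sum(a * b for a, b in zip(qv, toy_embed(d))), d) for d in docs]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]
```

In the real version, the "defining attributes" described above correspond to which dimensions of the embedding carry weight for a given blob of data.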
In the coming months and years, every organization, and even individuals, will need vector databases to overcome the limitations of LLMs (including limited context windows, knowledge cutoff dates, and hallucinations) and to use generative AI effectively.
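To close the loop, here's a sketch of the grounding step itself: assembling retrieved chunks into a prompt under a context-window budget. The function name and the character-count budget are illustrative assumptions, not any vendor's API.

```python
def build_grounded_prompt(question: str, ranked_chunks: list[str],
                          max_chars: int = 4000) -> str:
    """Assemble a RAG prompt: instructions + retrieved context + the question.
    max_chars is a crude stand-in for the LLM's context-window budget."""
    context: list[str] = []
    used = 0
    for chunk in ranked_chunks:  # assumed already ranked by hybrid relevance
        if used + len(chunk) > max_chars:
            break  # stop before overflowing the context budget
        context.append(chunk)
        used += len(chunk)
    return (
        "Answer using ONLY the context below. If the answer is not in the "
        "context, say you don't know.\n\n"  # guards against hallucination
        "Context:\n" + "\n---\n".join(context)
        + f"\n\nQuestion: {question}"
    )
```

The instruction to answer only from the supplied context is what lets retrieval mitigate hallucinations and knowledge cutoffs: the model is steered toward fresh, relevant data rather than its training-time memory.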