Building Data Moats in the Age of LLMs
LLMs as decision-making systems
While the landscape of LLMs is changing rapidly, with new tools emerging almost weekly, one theme has become increasingly clear - LLMs need to be viewed as reasoning and decision-making engines, not merely content generators. This idea was in its infancy when LangChain, AutoGPT, BabyAGI and other open-source projects started building on the concept of "Agents". However, with OpenAI's release of function calling earlier this year and the recent release of the Assistants API, the concept is gaining more attention and traction. In this role, an LLM acts like a central "brain": it accepts a goal, a user's input (optional), the set of tools it has access to, and some contextual information (e.g., the conversational history and information relevant to the conversation), and then generates the next relevant response or action for that user.

A key challenge of working with LLMs in this fashion has been their stateless nature - an LLM on its own has no concept of "memory". LangChain addressed this issue with its "Conversation Memory" framework, and more recently OpenAI introduced "threads" in its API to tackle it. But retaining conversational history alone will not transform LLMs into powerful reasoning and decision-making engines - these systems need the ability to fetch relevant information in real time so they can reason and act according to the goal assigned to them. The ML community's answer to this has been the Retrieval Augmented Generation (RAG) framework. So far, RAG has primarily been used for knowledge-base queries, in which the LLM acts as a content generator of sorts. Given the evolving role of LLMs towards becoming decision-making engines, businesses now have an interesting opportunity to start using their data strategically.
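To make this loop concrete, here is a minimal sketch of the "LLM as a brain" pattern using the OpenAI Python SDK's (v1.x) function-calling interface. The tool definition, the get_flight_availability stub and the dispatch logic are illustrative assumptions for this article, not a prescribed implementation.

```python
import json
from openai import OpenAI  # assumes the v1.x OpenAI Python SDK

client = OpenAI()

def get_flight_availability(flight_number: str) -> dict:
    # Hypothetical stub - a real system would query a reservations backend.
    return {"flight_number": flight_number, "seats_available": 4}

tools = [{
    "type": "function",
    "function": {
        "name": "get_flight_availability",
        "description": "Look up remaining seats on a flight.",
        "parameters": {
            "type": "object",
            "properties": {"flight_number": {"type": "string"}},
            "required": ["flight_number"],
        },
    },
}]

# The model receives the goal, the conversation so far and the available
# tools, then decides on the next step: answer directly or call a tool.
messages = [
    {"role": "system", "content": "You help travelers book flights."},
    {"role": "user", "content": "Are there seats left on UA123?"},
]
response = client.chat.completions.create(model="gpt-4", messages=messages, tools=tools)
reply = response.choices[0].message

if reply.tool_calls:  # the model chose an action rather than a reply
    call = reply.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = get_flight_availability(**args)
    # Feed the tool result back so the model can produce the final answer.
    messages.append(reply)
    messages.append({"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)})
    final = client.chat.completions.create(model="gpt-4", messages=messages, tools=tools)
    print(final.choices[0].message.content)
else:
    print(reply.content)
```

Note that the loop is stateless by default: unless the application keeps appending to `messages` (or uses a mechanism like OpenAI's threads), the model forgets everything between calls.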
Rethinking RAG as a context engine
A generative AI system that is tasked with decision-making must have insights that go beyond static information - it needs dynamic, situational awareness. For instance, knowing whether an item is in stock, or the availability of seats on a particular flight, requires real-time data retrieval and synthesis. Here, the RAG system acts as a conduit between vast data repositories, backend systems and the LLM, channeling relevant information (context) to the model so it can make informed decisions. This synthesis must also align with stringent data security and privacy standards, adding layers of complexity to the RAG process. In this new landscape, the context retrieval process is as important as the language model itself.
Yet this is not simply about fetching data; it is about understanding the user's request, merging multiple data sources to augment the context, and injecting this context along with the prompt, thereby sharpening the model's decision-making acumen. The more up-to-date and relevant the details the system can gather about the user's task or query, the more adept it becomes at helping the user achieve their goals. This makes the system more valuable, encouraging the user to rely on it more frequently. The increased usage, in turn, feeds the system additional data, creating a virtuous cycle that continually improves its performance - a data flywheel. Moreover, infusing the RAG context engine with the latest data is generally simpler and less resource-intensive than fine-tuning the models themselves, which demands significant computational power and time - and it can happen in near-real time.
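As a rough illustration of that pipeline, the sketch below assembles a context block from two hypothetical sources - an embedding-based document search and a live inventory lookup - and injects it ahead of the user's question. Every helper here (vector_search, get_stock, the SKU format) is an illustrative stub, not a specific product's API.

```python
# A minimal sketch of a RAG "context engine": gather relevant, up-to-date
# context from several sources and inject it alongside the user's prompt.

def vector_search(query: str, k: int = 3) -> list[str]:
    # Stand-in for an embedding-based search over a knowledge base.
    return ["Return policy: in-stock items ship within 2 business days."]

def get_stock(sku: str) -> int:
    # Stand-in for a real-time call to an inventory backend.
    return 12

def build_context(query: str, sku: str) -> str:
    docs = vector_search(query)                  # static knowledge
    stock = get_stock(sku)                       # live, situational data
    lines = [f"- {d}" for d in docs]
    lines.append(f"- Current stock for {sku}: {stock} units")
    return "\n".join(lines)

def build_prompt(query: str, sku: str) -> str:
    # The merged context is injected ahead of the user's question, so the
    # model reasons over fresh facts instead of only its training data.
    return (
        "Answer using only the context below.\n"
        f"Context:\n{build_context(query, sku)}\n\n"
        f"Question: {query}"
    )

print(build_prompt("Can I get the espresso machine by Friday?", "SKU-4421"))
```

The key design point is that freshness lives in the retrieval layer: swapping in new inventory data requires no change to the model, which is why this path is cheaper and faster than fine-tuning.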
RAG provides a strategic opportunity
With LLMs reducing the barrier to entry for building generative AI products, carving out unique features in such products has become increasingly challenging. However, the RAG framework emerges as a bright spot in this landscape. It enables differentiation by integrating real-time and proprietary data into the LLM's context, thereby expanding the AI's understanding and adaptability to specific user needs. This gives businesses an interesting formula for standing out in a crowded market.
The RAG setup offers enterprises another edge: it allows them to deploy smaller, specialized LLMs that deliver efficiency and cost savings as they scale their operations. Finally, because RAG systems interweave rule-based retrievers with machine learning elements like embeddings and reranking, they introduce some determinism into the overall system. This transparency simplifies troubleshooting and can provide some explainability in an otherwise black-box LLM system.
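To illustrate why that mix aids troubleshooting, here is a small, self-contained sketch of a rule-based filter followed by an embedding-similarity ranking, with a logged trace at each stage. The toy embed function and document schema are assumptions for illustration only; a production system would use a learned embedding model and typically a cross-encoder reranker.

```python
import math

def embed(text: str) -> list[float]:
    # Toy stand-in for a learned embedding model: character counts.
    return [text.count(c) for c in "abcdefghij"]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm if norm else 0.0

def retrieve(query: str, docs: list[dict], region: str) -> list[dict]:
    # Rule-based stage: a deterministic filter that is easy to audit.
    candidates = [d for d in docs if d["region"] == region]
    # ML stage: score the surviving candidates by embedding similarity.
    q = embed(query)
    for d in candidates:
        d["score"] = cosine(q, embed(d["text"]))
    ranked = sorted(candidates, key=lambda d: d["score"], reverse=True)
    # Because every stage is explicit, a bad answer can be traced to the
    # filter or the scorer - unlike the LLM's internal reasoning.
    for d in ranked:
        print(f"[retrieval trace] score={d['score']:.3f} doc={d['text'][:40]!r}")
    return ranked[:2]

docs = [
    {"text": "EU returns accepted within 30 days.", "region": "eu"},
    {"text": "US returns accepted within 14 days.", "region": "us"},
    {"text": "EU shipping takes 3-5 business days.", "region": "eu"},
]
retrieve("what is the return window?", docs, region="eu")
```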
Overall, RAG provides an interesting opportunity for enterprises to use their existing data, build great user experiences and thereby create differentiation.