From Vector Databases to Hybrid RAG for Enterprise Gen AI
Vector databases have become synonymous with Retrieval Augmented Generation (RAG) for LLM applications in recent discourse. The embeddings approach that underpins these databases makes a lot of sense for a general-purpose Gen AI solution like ChatGPT; vector matching provides a broad, content-agnostic solution for identifying relevant information about any conceivable topic.
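To make the vector-matching idea concrete, here is a minimal sketch of embedding-based retrieval. The three-dimensional vectors and the `retrieve` helper are illustrative assumptions only; a real system would compute embeddings with a model (e.g. a sentence-transformer) and store them in a vector database.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# Toy corpus of pre-computed embeddings; 3-d vectors stand in for the
# high-dimensional embeddings a real model would produce.
corpus = {
    "shipping policy": [0.9, 0.1, 0.0],
    "refund process": [0.7, 0.6, 0.1],
    "engineering roadmap": [0.0, 0.2, 0.9],
}

def retrieve(query_vec, k=2):
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(corpus.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [doc for doc, _ in ranked[:k]]

print(retrieve([0.8, 0.3, 0.0]))  # → ['shipping policy', 'refund process']
```

The content-agnostic nature of the approach is visible here: nothing in `retrieve` knows anything about shipping or refunds; relevance falls out of geometric proximity in embedding space.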
The use case for internal AI agents within an Enterprise is very different, in three important ways:
As foundation models become commoditized and prompt optimization gets systematized (e.g. with DSPy), the value differentiation of an AI system will shift from its inference engine to its “data power”: how well it understands the user request and locates relevant data across disparate data sources to supply as context. Equally important, the system must seamlessly absorb new data as it is created, so that it continues to “learn” over time.
On the technology side, data within an Enterprise is typically distributed across many different data sources: transactional SQL systems, NoSQL/object databases, graph databases, file systems, APIs, etc. In addition, domain-specific taxonomies and rich metadata can add semantic depth to the search.
At ADS, we’re building a hybrid RAG system designed to fetch data from this rich variety of sources using multiple technologies (SQL queries, knowledge graphs, keyword search, vector search, etc.) and then to stitch the results together appropriately to feed into the LLM prompt.
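One way such stitching can work is to run each retriever independently and merge the rankings, for example with reciprocal rank fusion. The sketch below is a hypothetical illustration, not the ADS design: the four stub retrievers and their hard-coded document IDs are assumptions standing in for real SQL, knowledge-graph, keyword, and vector back ends.

```python
from collections import defaultdict

# Hypothetical per-source retrievers; each returns a ranked list of document
# IDs. In a real deployment these would wrap a SQL query, a knowledge-graph
# traversal, a keyword (BM25) index, and a vector store respectively.
def sql_retrieve(query):     return ["orders_2024", "customer_123"]
def graph_retrieve(query):   return ["customer_123", "account_mgr_7"]
def keyword_retrieve(query): return ["faq_shipping", "orders_2024"]
def vector_retrieve(query):  return ["orders_2024", "faq_shipping"]

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Merge several rankings into one: each document scores 1/(k + rank)
    per list it appears in, so documents ranked well by multiple sources
    float to the top."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc in enumerate(ranking):
            scores[doc] += 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

def hybrid_retrieve(query):
    rankings = [r(query) for r in
                (sql_retrieve, graph_retrieve, keyword_retrieve, vector_retrieve)]
    return reciprocal_rank_fusion(rankings)

print(hybrid_retrieve("where is my order?"))
```

Rank fusion is attractive here because it needs no score calibration across heterogeneous sources: a SQL hit and a vector hit are compared only by their positions in their own rankings, never by incomparable raw scores.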
This sophisticated RAG system then becomes the backbone of the AI system, the growing knowledge base that enables AI agents to provide correct answers.