RAGs to Riches
Retrieval Augmented Generation with Query Expansion
If you haven’t heard of retrieval augmented generation (“RAG”), it is absolutely blowing up the AI space.
In this article, I give a couple of interesting ideas to boost the "retrieval" part of your system. In other words, the "search" that finds the subset of your custom, private data that is most closely related to your question. This data is then included in the prompt for the LLM to use when responding.
The usual RAG approach is to reroute the user's query away from the LLM by first taking a detour in the form of a search across your documents, knowledge bases, and other custom data. You then take the most relevant text chunks from the search and instruct the LLM to use only that context to compose its answer. (See Figure 2 below. This approach can use either a traditional search or a separate AI-powered search using vector embeddings.)
1) HyDE Queries
The first new and interesting tweak I have for you today is called hypothetical document embeddings, or HyDE. The idea behind HyDE is that a search will yield better results if you use an LLM to create a hypothetical, hallucinated answer… append it to the query… and submit the combined string to the search instead of just the query by itself. (For more detail, see this excellent interview from Sam Partee of Redis.)
This makes sense because, even with an answer based on the random, general knowledge which a free-flowing LLM will fabricate, the combined query string will more than likely gain some rich semantic information and keywords.
Here’s an example: Say a new employee asks the onboarding chatbot the question: “Who can get me a Salesforce login?”…
# Query for Standard Retrieval:
Query1 = "Who can get me a Salesforce login?"
…but your company doesn’t even use Salesforce. Instead, your company uses Hubspot…so the search/retrieval step is not going to produce good keyword matches from your PDFs, policies, and procedures.
But if you submit this question before the search/retrieval step, and let an LLM run free on an arbitrary, made-up response, you might get a combined string like this:
# Query for HyDE-modified Retrieval:
Query2 = """Who can get me a Salesforce login?
A: To gain access to the Salesforce customer relationship
manager (CRM) and interact with client and prospecting
data, you can contact the Sales Operation department or
the IT department by email or phone at 555–5555."""
Submitting the combined question+answer pair to the search/retrieval step gives it a lot more meaty keywords and context with which to pull relevant documents. (On the downside, you do have to run the LLM an extra time before each RAG query, but in many instances the accuracy-vs-latency tradeoff may be worth it.)
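The HyDE step described above can be sketched in a few lines. This is a minimal illustration, not the original article's code: `llm_complete` stands in for whatever LLM client you actually use, and the prompt wording is my own assumption.

```python
def hyde_query(query, llm_complete):
    """Build a HyDE-style retrieval string: ask an LLM for a
    hypothetical (possibly hallucinated) answer, then append it
    to the original query before running the search."""
    hypothetical = llm_complete(
        "Answer the following question briefly, inventing plausible "
        f"specifics if necessary:\n{query}"
    )
    return f"{query}\nA: {hypothetical}"

# Canned stand-in for a real LLM call, just to show the shape:
fake_llm = lambda prompt: (
    "To gain access to the CRM, contact Sales Operations or IT."
)
expanded = hyde_query("Who can get me a Salesforce login?", fake_llm)
print(expanded)
```

The returned string, not the bare question, is what you embed (or keyword-search) against your document index.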
2) Full, AI-Powered Query Expansion
So when I heard about this, I realized it is actually a new riff on a tried-and-true staple of search known as query expansion. So I thought: Hey, what if you explicitly asked the LLM to perform a full query expansion? You could ask it to produce not just the hypothetical answer, but also related keywords and expanded acronyms, all in one call.
Here’s an example:
# Full, AI-Powered Query Expansion:
Query3 = """Who can get me a Salesforce login?
A: To gain access to the Salesforce customer relationship
manager (CRM) and interact with client and prospecting
data, you can contact the Sales Operation department or
the IT department by email or phone at 555–5555.
Related Keywords: ["sales", "marketing", "customer relationship",
"password", "authentication"]
Acronyms: [("CRM", "customer relationship management"),
("IT", "information technology")]
"""
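One way to generate a string like Query3 programmatically is to ask the LLM for structured output and flatten it yourself. This is a sketch under my own assumptions: the prompt text, the JSON schema, and `llm_complete` are all illustrative, not from the article.

```python
import json

EXPANSION_PROMPT = """You are a query-expansion assistant for document search.
Given a user question, reply with JSON containing:
  "answer":   a brief hypothetical answer (invent plausible specifics),
  "keywords": a list of related search keywords,
  "acronyms": a list of [acronym, expansion] pairs.
Question: {question}"""

def expand_query(question, llm_complete):
    """Ask the LLM for a full expansion, then flatten the pieces
    into a single keyword-rich search string."""
    raw = llm_complete(EXPANSION_PROMPT.format(question=question))
    data = json.loads(raw)
    acronym_text = " ".join(f"{a} {full}" for a, full in data["acronyms"])
    return " ".join([question, data["answer"], *data["keywords"], acronym_text])

# Canned stand-in for a real LLM call:
fake_llm = lambda prompt: json.dumps({
    "answer": "Contact Sales Operations or IT for CRM access.",
    "keywords": ["sales", "password", "authentication"],
    "acronyms": [["CRM", "customer relationship management"]],
})
print(expand_query("Who can get me a Salesforce login?", fake_llm))
```

In practice you would also want to handle the LLM returning malformed JSON, e.g. by retrying or falling back to the bare query.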
Now here’s what’s even more interesting about using a full, AI-powered query expansion like this: you might be able to capture enough semantic information that you can reliably drop down to a traditional keyword-based search algorithm like BM25, which is much cheaper and faster than an AI vector-based search, while still achieving comparable system accuracy. So that’s where the payoff could lie.
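To make the BM25 payoff concrete, here is a small self-contained scorer (the standard BM25 formula, written out rather than using a library) run against a toy two-document corpus of my own invention. Notice that the bare question matches nothing, while the expanded query pulls up the CRM document.

```python
import math
from collections import Counter

def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Score each tokenized document against the query with BM25."""
    n_docs = len(docs_tokens)
    avgdl = sum(len(d) for d in docs_tokens) / n_docs
    # document frequency of each query term
    df = {t: sum(1 for d in docs_tokens if t in d) for t in set(query_tokens)}
    scores = []
    for d in docs_tokens:
        freqs = Counter(d)
        score = 0.0
        for t in query_tokens:
            if df[t] == 0:
                continue  # term appears in no document
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            tf = freqs[t]
            score += idf * tf * (k1 + 1) / (
                tf + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

corpus = [
    "contact the sales operations team for crm access and password resets",
    "the cafeteria menu changes weekly and includes vegetarian options",
]
docs = [doc.split() for doc in corpus]

plain = "who can get me a salesforce login".split()
expanded = plain + ["crm", "customer", "relationship",
                    "password", "sales", "operations"]

print(bm25_scores(plain, docs))     # no keyword overlap at all
print(bm25_scores(expanded, docs))  # CRM doc now scores highest
```

The plain query scores zero on both documents (no shared terms), while the expanded query cleanly ranks the CRM/password document first, which is exactly the behavior the expansion is meant to buy you.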
If you’re experimenting with RAG architectures, or just starting to think about AI and LLMs, I’d love to hear your thoughts and questions in the comments! Thanks!
Chief AI Officer at Invisibly
Video version: https://www.dhirubhai.net/posts/davidcostenaro_ai-llm-search-activity-7166068261061574659-Oxw1?utm_source=share&utm_medium=member_ios