RAGs to Riches
Figure 1. RAGs to Riches. Generated with ChatGPT/Dall-E.

Retrieval Augmented Generation with Query Expansion

If you haven’t heard of it yet, retrieval augmented generation (“RAG”) is absolutely blowing up the AI space.

In this article, I give a couple of interesting ideas to boost the “retrieval” part of your system: in other words, the “search” that finds the subset of your custom, private data most closely related to the user’s question. This data is then included in the prompt for the LLM to use when responding.

The usual RAG approach is to reroute the user’s query away from the LLM by first taking a detour: a search across your documents, knowledge bases, and other custom data. You then take the most relevant text chunks from the search and instruct the LLM to use only that context to compose its answer. (See Figure 2 below. This approach can use either a traditional search or a separate AI-powered search using vector embeddings.)

Figure 2. Process Flow for Standard LLM queries and Retrieval Augmented Generation (RAG) queries.
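To make the flow concrete, here is a minimal sketch of that standard RAG loop in Python. The `search` and `llm` callables are hypothetical stand-ins for your own retrieval index and chat-model client:

# Minimal sketch of the standard RAG flow in Figure 2.
# `search` and `llm` are hypothetical stand-ins for your own
# retrieval index and chat-model client.
def rag_answer(question, search, llm, k=5):
    # 1. Detour: retrieve the k most relevant text chunks first.
    chunks = search(question, top_k=k)
    context = "\n\n".join(chunks)
    # 2. Instruct the LLM to answer from that context only.
    prompt = (
        "Answer the question using ONLY the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm(prompt)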

1) HyDE Queries

The first new and interesting tweak I have for you today is called hypothetical document embeddings, or HyDE. The idea behind HyDE is that a search will yield better results if you use an LLM to create a hypothetical, hallucinated answer… append it to the query… and submit the combined string to the search instead of just the query by itself. (For more detail, see this excellent interview with Sam Partee of Redis.)

This makes sense because, even with an answer fabricated from the random, general knowledge of a free-flowing LLM, the combined query string will more than likely gain some rich semantic information and keywords.

Here’s an example: say a new employee asks the onboarding chatbot, “Who can get me a Salesforce login?”…

# Query for Standard Retrieval:
Query1 = "Who can get me a Salesforce login?"        

…but your company doesn’t even use Salesforce. Instead, your company uses HubSpot… so the search/retrieval step is not going to produce good keyword matches from your PDFs, policies, and procedures.

But if you first submit this question to an LLM, before the search/retrieval step, and let it run free with an arbitrary, made-up response, you might get a combined string like this:

# Query for HyDE-modified Retrieval:
Query2 = """Who can get me a Salesforce login?
A: To gain access to the Salesforce customer relationship 
manager (CRM) and interact with client and prospecting 
data, you can contact the Sales Operations department or
the IT department by email or phone at 555-5555."""

Submitting the combined question+answer pair to the search/retrieval step gives it much meatier keywords and context to pull from relevant documents. (On the downside, you do have to run the LLM an extra time before each RAG query, but in many instances the accuracy-vs-latency tradeoff may be worth it.)
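Here is a rough sketch of the HyDE step, assuming the OpenAI Python client; the model name and prompt wording are placeholders you would tune, and `vector_search` is a hypothetical helper over your own document index:

# Rough HyDE sketch. Assumes the OpenAI Python client; the model
# name and prompt are placeholders, and `vector_search` is a
# hypothetical helper over your own document index.
from openai import OpenAI

client = OpenAI()

def hyde_query(question):
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Give a short, plausible answer to the "
                        "question. It is fine to guess details."},
            {"role": "user", "content": question},
        ],
    )
    answer = response.choices[0].message.content
    # The combined string carries far more semantic signal than
    # the bare question does.
    return f"{question}\nA: {answer}"

# chunks = vector_search(hyde_query("Who can get me a Salesforce login?"))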

2) Full, AI-Powered Query Expansion

When I heard about this, I realized it is actually a new riff on a tried-and-true staple of search known as query expansion. So I thought: what if you explicitly asked the LLM to perform a full query expansion? You could ask it to:

  1. Make up a best-guess answer, just as in HyDE, but then also…
  2. Include related keywords that would optimize a search… and…
  3. Even reference a list of company acronyms and abbreviations to be expanded or defined if they appear in the search string.

Here’s an example:

# Full, AI-Powered Query Expansion:
Query3 = """Who can get me a Salesforce login?

A: To gain access to the Salesforce customer relationship 
manager (CRM) and interact with client and prospecting 
data, you can contact the Sales Operations department or
the IT department by email or phone at 555-5555.

Related Keywords: ["sales", "marketing", "customer relationship",
 "password", "authentication"]

Acronyms: [("CRM", "customer relationship management"), 
("IT", "information technology")]
"""        

Now here is what’s even more interesting about using a full, AI-powered query expansion like this: you might attain enough semantic information that you can reliably drop down to a traditional keyword-based search algorithm like BM25, which is much cheaper and faster than an AI vector-based search, while still achieving comparable system accuracy. That is where the payoff could lie.
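As a sketch of that cheaper path, the open-source rank_bm25 package (pip install rank-bm25) can score the expanded string against your chunks with classic BM25; the `docs` list here is a made-up example corpus:

# BM25 keyword search over the expanded query, using the
# open-source rank_bm25 package. `docs` is a made-up corpus;
# in practice these would be your own text chunks.
from rank_bm25 import BM25Okapi

docs = [
    "Contact Sales Operations for access to the HubSpot CRM.",
    "IT handles password resets and login authentication.",
]

bm25 = BM25Okapi([doc.lower().split() for doc in docs])
top_chunks = bm25.get_top_n(Query3.lower().split(), docs, n=1)
print(top_chunks)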

If you’re experimenting with RAG architectures, or just starting to think about AI and LLMs, I’d love to hear your thoughts and questions in the comments! Thanks!

