Retrieval-Augmented Generation

Good morning! On this pleasant morning I’d like to talk to you about hallucinations.

Not the 1960s kind, though, but a different, modern kind of hallucination related to the field of Artificial Intelligence. Unless you have been living under a rock this past year, you know that modern Large Language Models (LLMs, a subset of Generative AI) are the talk of the town. And one of the most discussed topics about LLMs is their hallucinations.

LLM hallucination is a phenomenon that occurs when a large language model generates text that is factually incorrect or nonsensical. This can happen for a variety of reasons, including:

  • The LLM may be trained on a dataset that contains inaccurate or incomplete information.
  • The LLM may be asked to generate text about a topic that it does not have enough information about (not enough domain knowledge).
  • The LLM may be asked to generate text that is too complex or challenging for it to handle.

You may say, well, that’s very human, right? Right!

When it comes to humans, we call similar behavior "intellectual dishonesty." Intellectual dishonesty refers to the act of deliberately presenting false or misleading information or arguments with the intention of appearing knowledgeable or avoiding being seen as ignorant or unintelligent.

It involves intentionally distorting facts or fabricating information to support one's position or to gain an advantage in a discussion or debate. Intellectual dishonesty can undermine intellectual integrity and hinder genuine learning and understanding. The difference is that LLMs do not do this intentionally: they do it because they are trained in a way that pushes them to produce an answer no matter what, even if they have to fabricate it.

Now, LLM hallucination can be a problem because it can lead to the spread of misinformation and the creation of unrealistic expectations about what LLMs can do (and we can see this happening a lot over the past year, since OpenAI made GPT-3 public). It is important to be aware of the potential for LLM hallucination and to take steps to mitigate it (as a user and as a developer).


Recently I was experimenting with a method called Retrieval-Augmented Generation (RAG) and its application to Enterprise Search, as a way to mitigate the effects of hallucination (which is extremely important in search interpretation). In my mind, this method is one of the key approaches to better LLMs, better domain-specific LLMs, and fine-tuned LLMs.

So what is this method about?

RAG is a technique for improving the quality of text generation by retrieving relevant documents (or facts) from a verified and/or curated knowledge base (KB) and then using those documents to augment the generation process. RAG works like this: first, a prompt is prepared for the text generation model; the prompt is then used to retrieve relevant documents from the KB; the retrieved documents/facts are then used to augment the generation process by providing additional information and context.
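
To make this a bit more concrete, below is a minimal sketch of the retrieval phase in Python. Everything in it is illustrative: the tiny in-memory knowledge base, the bag-of-words similarity, and the "retrieve" helper are stand-ins for the embedding model and vector store a real system would use, but the shape of the step is the same.

from collections import Counter
import math
import re

# Illustrative mini knowledge base; in practice this is a curated document store.
KNOWLEDGE_BASE = [
    "War and Peace is a novel by Leo Tolstoy about Russian society during the Napoleonic era.",
    "The novel follows several aristocratic families through the Napoleonic Wars.",
    "The White Guard is a novel by Mikhail Bulgakov set in Kiev during the Russian Civil War.",
]

def bag_of_words(text: str) -> Counter:
    # Lowercase the text and count alphanumeric tokens.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine_similarity(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(prompt: str, kb: list[str], top_k: int = 2) -> list[str]:
    # Rank KB entries by similarity to the prompt and return the best ones.
    query = bag_of_words(prompt)
    ranked = sorted(kb, key=lambda doc: cosine_similarity(query, bag_of_words(doc)), reverse=True)
    return ranked[:top_k]

print(retrieve("Write a summary of the book War and Peace.", KNOWLEDGE_BASE))

In a real deployment you would swap the bag-of-words scoring for dense embeddings and an approximate nearest-neighbour index, but the retrieval contract stays the same: prompt in, a handful of relevant facts out.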

RAG has been shown to improve the quality of text generation in a variety of tasks, including summarisation, question answering, and creative writing; basically, most (if not all) of the LLM use cases.

Here is an example of how RAG can be used to improve the quality of a summary:

Prompt phase: Write a summary of the book "War and Peace".
Retrieval phase: The following documents are retrieved from the knowledge base:
* "War and Peace" Wikipedia article
* "War and Peace" Goodreads page
* "War and Peace" Amazon page
* "War and Peace" Britannica page
Augmentation phase: The documents are used to augment the generation process by providing additional information and context. For example, the Wikipedia and Britannica articles provide information about the plot, characters, and setting of the book, plus some additional historical context. The Goodreads page provides information about reviews of the book. The Amazon page provides information about the price of the book and also some additional review information, which could be beneficial (or not :) ).
Generation phase: The text generation model is then used to generate a summary of the book, using the information from the documents as a guide. The generated summary is more accurate and informative than one generated without RAG, because the model is mostly asked to summarise factual data rather than produce a potential hallucination in which it mixes together "War and Peace" and "The White Guard", for example. A minimal sketch of the augmentation and generation phases follows below.
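
As with the retrieval sketch, the names here are illustrative: the "facts" list stands in for whatever the retrieval phase returned, and "call_llm" is a placeholder for whichever model API you actually use, not a real library call.

def build_augmented_prompt(task: str, facts: list[str]) -> str:
    # Prepend the retrieved facts as grounding context, then state the task.
    context = "\n".join(f"- {fact}" for fact in facts)
    return (
        "Use only the facts below. If they are not enough, say so instead of guessing.\n"
        f"Facts:\n{context}\n\n"
        f"Task: {task}"
    )

def call_llm(prompt: str) -> str:
    # Placeholder: swap in a real call to the LLM you use.
    return f"[model output grounded in {prompt.count('- ')} retrieved facts]"

# Facts the retrieval phase would have pulled from Wikipedia, Britannica, etc.
facts = [
    "War and Peace is a novel by Leo Tolstoy that follows several aristocratic families through the Napoleonic Wars.",
    "The novel combines a fictional narrative with Tolstoy's reflections on history and free will.",
]
task = 'Write a summary of the book "War and Peace".'
print(call_llm(build_augmented_prompt(task, facts)))

The instruction at the top of the augmented prompt is the important part: it tells the model to stay within the retrieved facts and to admit when they are insufficient, which is exactly the behaviour that keeps it from mixing "War and Peace" with "The White Guard".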

In a nutshell - RAG is a powerful technique that can be used to improve the quality of text generation. It is a promising new[ish] technology that has the potential to revolutionise the way we generate text. Check out the research paper to get more details on how to use RAG, if this short post sparked your interest.
