Retrieval-Augmented Generation

We have started to use LLMs extensively in our daily lives: when in doubt, we go to ChatGPT and ask it a question. The other day, I was wondering which car is the most expensive in the world, so I asked ChatGPT, and this is what I got:

Response from ChatGPT

There are two problems with this answer:

  1. The information might be outdated.
  2. We do not know the source of the information.

Large Language Models are trained on a corpus of data from the internet and other sources, but as a model ages, its information also becomes outdated, which requires regular re-training of such models.

RAG (Retrieval-Augmented Generation) is a framework that combines the power of LLMs with knowledge banks for knowledge-intensive NLP tasks. It references an external knowledge base (which was not part of the model's training data) to retrieve facts before generating a response.

The architecture of RAG contains two major components:

1. Retriever

2. Generator

Retriever:

The retriever component of the RAG (Retrieval-Augmented Generation) model is responsible for retrieving relevant material from a large corpus or knowledge database, such as Wikipedia or an internal database, in response to an input query.

  • Input Query: The model accepts an input query (a question or prompt).
  • Encoding: A neural network-based encoder converts the input query into a dense vector representation. In the case of RAG, the encoder is built on the BERT (Bidirectional Encoder Representations from Transformers) architecture.
  • Document Index: The retriever has access to a document index, which is essentially a collection of pre-encoded documents. Each document in the index is encoded into a dense vector representation using a document encoder, which is often also based on BERT.
  • Retrieval: The encoded query is compared against the encoded representations of the documents in the index using a similarity metric, typically cosine similarity. The retriever selects the top-k documents in the index that are most similar to the query.
  • Output: The retriever returns the top-k retrieved documents, along with their similarity scores.
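The retrieval step above can be sketched in a few lines of Python. Here the "embeddings" are random vectors standing in for real BERT encodings, and the document ids are placeholders; this is a minimal illustration of cosine-similarity top-k retrieval, not the paper's actual retriever.

```python
import numpy as np

# Toy "document index": pre-encoded documents. In a real system these
# vectors would come from a BERT-style document encoder; random vectors
# are used here purely for illustration.
rng = np.random.default_rng(0)
doc_vectors = rng.normal(size=(5, 8))        # 5 documents, 8-dim embeddings
documents = [f"doc-{i}" for i in range(5)]   # placeholder document ids

def cosine_top_k(query_vec, doc_vecs, k=2):
    """Rank documents by cosine similarity to the query and return the top-k."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q                        # cosine similarity per document
    top = np.argsort(scores)[::-1][:k]    # indices of the k best matches
    return [(documents[i], float(scores[i])) for i in top]

query_vec = rng.normal(size=8)            # stand-in for an encoded query
print(cosine_top_k(query_vec, doc_vectors, k=2))
```

Production systems replace the brute-force comparison with an approximate nearest-neighbor index (e.g. FAISS), but the scoring logic is the same.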

Reference: arXiv:2005.11401v4

Generator:

The generator produces the next token in the sequence using the query, the retrieved documents, and any previously generated tokens as inputs. It gains additional context by concatenating the input query with the retrieved documents. During training, the generator is fine-tuned to produce the desired sequence given the input query and the retrieved documents.

At test time, the generator creates the output sequence token by token, based on the input and retrieved documents.
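The concatenation step can be sketched as follows. The prompt layout and the example passages are illustrative assumptions, not the paper's exact input format; the point is simply that the generator conditions on both the query and the retrieved text.

```python
def build_prompt(query, retrieved_docs):
    """Concatenate retrieved passages with the query so the generator
    can condition on both. The exact layout is an illustrative choice."""
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(retrieved_docs))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

# Hypothetical retrieved passages for the running example.
docs = [
    "The Rolls-Royce Boat Tail is among the most expensive cars ever sold.",
    "Prices for bespoke luxury cars can exceed $20 million.",
]
prompt = build_prompt("Which car is the most expensive in the world?", docs)
print(prompt)
```

This prompt would then be fed to the generator, which decodes the answer token by token.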


For more information, please refer to this research paper: https://proceedings.neurips.cc/paper/2020/hash/6b493230205f780e1bc26945df7481e5-Abstract.html
