What is Retrieval-Augmented Generation?
Image courtesy: Microsoft Designer

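Retrieval-Augmented Generation (RAG) is a method used in AI to improve the quality and relevance of generated content. It has two parts: retrieval, which finds information relevant to the question, and generation, in which a language model writes the answer using that information.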

Imagine the question, "How much profit did we make in Quarter 1 in the APAC region?" If you ask any general-purpose LLM this question, you will get a generic response. The model can handle the general part of the question fairly easily, but the company-specific part requires more contextual data. So, we need to use data from the company's portal, which could include PDF documents, webpages, etc. Essentially, we need to pull out the sections of the website that have specific information about the profit earned in the APAC region.

Technically, each webpage is converted into a numerical representation called an embedding, which is a vector of numbers that captures the meaning of the text. The user's question is turned into an embedding in the same way. The system then compares the question's embedding with the webpages' embeddings, typically using a measure such as cosine similarity, and the closest vectors indicate the most relevant content. For instance, if the question is about "APAC," the system might retrieve the top 5 webpages related to APAC. In the last step, the retrieved content is passed to the model together with the user's question, along with an instruction to answer using that information from the company's website. That completes the RAG process.
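To make the steps concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a production setup: the `embed` function is a toy word-count stand-in for a real embedding model, and the page names, page texts, and the function names `retrieve` and `build_prompt` are all hypothetical. The flow follows the description above: embed the pages and the question, rank pages by cosine similarity, keep the top few, and place them in the prompt.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector.

    A real system would call a learned embedding model here instead.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question: str, pages: Dict[str, str], top_k: int = 5) -> List[str]:
    """Return the names of the top_k pages whose embeddings are closest to the question."""
    q = embed(question)
    ranked = sorted(
        pages,
        key=lambda name: cosine_similarity(q, embed(pages[name])),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, pages: Dict[str, str], top_k: int = 5) -> str:
    """Combine the retrieved pages and the question into one prompt for the model."""
    names = retrieve(question, pages, top_k)
    context = "\n\n".join(f"[{name}]\n{pages[name]}" for name in names)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical company pages, used only for illustration.
PAGES = {
    "apac-q1-results": "Profit earned in the APAC region during Quarter 1 is reported in this section.",
    "emea-q1-results": "Revenue booked in the EMEA region during Quarter 1 is reported in this section.",
    "careers": "Open roles and how to apply for them.",
}

if __name__ == "__main__":
    question = "How much profit did we make in Quarter 1 in the APAC region?"
    print(build_prompt(question, PAGES, top_k=2))
```

The resulting prompt string is what would be sent to the LLM. Swapping the toy `embed` for a real embedding model changes the quality of the ranking but not the overall shape of the pipeline.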

Sample embeddings from ChatGPT

For more bite-sized design insights, follow Design Shots. https://www.dhirubhai.net/company/designshots/
