What is RAG and why is it dead?

The nets are ablaze with proclamations that “RAG is dead!” But what is RAG, and why is it dead?

Let’s take the Zen approach and start with what RAG is not. RAG is not:

• A method of fine-tuning a model

• A method of teaching a system

• A method of connecting your data to a chatbot

• A specific/prescriptive way of connecting data to an LLM


I think these misconceptions have given RAG a bad reputation and left a bad taste in the mouths of stakeholders, some of whom have invested heavily in AI based on expectations gleaned from a naïve RAG demo.

RAG means Retrieval-Augmented Generation. Generation refers to what an LLM does in response to a query. Retrieval-Augmented means that data is searched for and presented to the LLM to enrich its response.

What many seem to mean when they use the term RAG is “Naïve RAG,” which is the simplest implementation of RAG, used mostly for dabbling and demonstration purposes. Naïve RAG consists of this strategy:

• Convert your source data to text if it isn’t text already.

• Break that data up into chunks (blocks of text of a specific size).

• Embed each chunk (turn the text into a numeric vector that captures its semantic meaning).

• Store each chunk in a database with its embedding as an indexable record.

• Intercept user prompts to the LLM.

• Embed the user’s query to capture its semantic meaning.

• Search the embedded indexes for the chunks whose meaning is most similar to the query.

• Retrieve those chunks and insert them into the user’s query before it reaches the LLM.

• Cross your fingers and hope that a question about dogs didn’t retrieve more questions about dogs instead of answers about dogs.

• Cross your other fingers and hope that, even if the right context was retrieved, the LLM will use it to augment its response as expected.
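
The steps above can be sketched in miniature. This is an illustrative toy, not a production pipeline: `embed` here is a bag-of-words stand-in for a real embedding model, and the in-memory `index` list stands in for a vector database. All function names are assumptions for the sketch.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: bag-of-words counts.
    # Real systems use a learned model producing dense vectors.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def chunk(text, size=6):
    # Break the source into fixed-size word blocks.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

# Build the "vector store": each chunk stored with its embedding.
source = "Dogs are loyal companions. Dogs need daily exercise and a balanced diet."
index = [(c, embed(c)) for c in chunk(source)]

def retrieve(query, k=1):
    # Embed the query and rank chunks by semantic similarity.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

def augment(query):
    # Splice retrieved chunks into the prompt before it reaches the LLM.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Note that nothing here distinguishes a chunk that *answers* a question about dogs from one that merely *asks* one, which is exactly the finger-crossing step above.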


Yes, this way of doing things should be dead, because it’s pretty useless at scale. But all this really is, is the naïve RAG approach. This isn’t “real” RAG and was never meant to serve applications at scale.

RAG can be a complex application that takes many steps and validation processes before returning a truly qualified response. RAG can return data that is machine-comprehensible, not just a block of text. RAG can return complete context, not just slices of similar content. LLMs can be prompted properly to understand what action to take regarding the retrieved context. The LLM’s responses can be intercepted and acted upon before being returned to the user.
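
One of those validation and interception steps can be sketched as follows, under stated assumptions: `retrieve` and `llm` are injected placeholders for your vector store and model, and `grounded` is a deliberately simple vocabulary-overlap check standing in for a real grounding validator (e.g., an entailment model or citation check). All names are illustrative.

```python
def grounded(answer, context, min_overlap=0.5):
    # Naive grounding check: reject answers whose vocabulary
    # barely overlaps the retrieved context. A real system would
    # use a stronger validator than word overlap.
    a = set(answer.lower().split())
    c = set(context.lower().split())
    return bool(a) and len(a & c) / len(a) >= min_overlap

def answer_with_guardrails(query, llm, retrieve):
    # retrieve() and llm() are injected; any store or model works.
    chunks = retrieve(query)
    if not chunks:
        # Validation step 1: refuse rather than let the LLM guess.
        return "No governed source covers this question."
    context = "\n".join(chunks)
    prompt = (
        "Answer ONLY from the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    response = llm(prompt)
    if not grounded(response, context):
        # Validation step 2: intercept the response before the user sees it.
        return "No grounded answer found in the retrieved material."
    return response
```

The point is architectural: the LLM call is one step in a pipeline that can refuse, retry, or rewrite, rather than the last word.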

Your imagination is the limit when it comes to implementing RAG beyond the naïve approach. If knowledge is a key requirement of your system, you might even consider the perspective that RAG is your application, and the LLM is just a composition layer.

RAG isn’t dead. People are just tired of it not working the way that they expect. It’s not working because so many people are doing it wrong.

This year, I’ve had a significant focus on developing machine-comprehensible retrieval mechanisms for systems that are required to produce regulatory-compliant responses. These responses must be free of hallucinations and must accurately reflect the true content of governed materials.

I’ve talked to several professionals recently who indicated that AI didn’t work for them, and their response to “I can fix that for you” was just to roll their eyes.

I don’t think RAG is dead, but I think, to many people, RAG is “dead to me.” It’s unfortunate that so many people are building these complex applications in the wrong way.

If your application relies on knowledge, then RAG isn’t just a feature; it’s your application. Those who dismiss RAG due to early failures risk missing out on the true promise of AI.

Those who push past these naïve implementations to develop robust, structured retrieval systems will lead the way in AI-driven outcomes. The future of AI isn’t about replacing RAG; it’s about appreciating its role and embracing its complexity.

