What is RAG and why is it dead?

The nets are ablaze with proclamations that “RAG is dead!” But what is RAG, and why is it dead?

Let’s take the Zen approach and start with what RAG is not. RAG is not:

• A method of fine-tuning a model

• A method of teaching a system

• A method of connecting your data to a chatbot

• A specific/prescriptive way of connecting data to an LLM


I think these misconceptions have given RAG a bad reputation and left a bad taste in the mouths of stakeholders, some of whom have invested heavily in AI based on expectations gleaned from a naïve RAG demo.

RAG means Retrieval-Augmented Generation. Generation refers to what an LLM does in response to a query. Retrieval-Augmented means that data is searched for and presented to the LLM to enrich its response.

What many seem to mean when they use the term RAG is “Naïve RAG,” which is the simplest implementation of RAG, used mostly for dabbling and demonstration purposes. Naïve RAG consists of this strategy:

• Convert your source data to text if it isn’t text already.

• Break that data up into chunks (blocks of text of a specific size).

• Embed each chunk (turn the text into a numeric vector that captures its semantic meaning).

• Store each chunk in a database with its embedding as an indexable record.

• Intercept user prompts to the LLM.

• Embed the user’s query to capture its semantic meaning.

• Search the embedded indexes for the chunks whose meaning is most similar to the query.

• Retrieve those chunks and insert them into the user’s query before it reaches the LLM.

• Cross your fingers and hope that a question about dogs didn’t retrieve more questions about dogs instead of answers about dogs.

• Cross your other fingers and hope that, even if the right context was retrieved, the LLM will use it to augment its response as expected.
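
The steps above can be sketched in miniature. This is an illustrative toy, not a production pipeline: `embed` here is a bag-of-words stand-in for a real embedding model, and the in-memory `index` list stands in for a vector database. All function names are assumptions for the sketch.

```python
import math
from collections import Counter

def embed(text):
    # Toy stand-in for an embedding model: bag-of-words counts.
    # Real systems use a learned model producing dense vectors.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def chunk(text, size=6):
    # Break the source into fixed-size word blocks.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

# Build the "vector store": each chunk stored with its embedding.
source = "Dogs are loyal companions. Dogs need daily exercise and a balanced diet."
index = [(c, embed(c)) for c in chunk(source)]

def retrieve(query, k=1):
    # Embed the query and rank chunks by semantic similarity.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [c for c, _ in ranked[:k]]

def augment(query):
    # Splice retrieved chunks into the prompt before it reaches the LLM.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

Note that nothing here distinguishes a chunk that *answers* a question about dogs from one that merely *asks* one, which is exactly the finger-crossing step above.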


Yes, this way of doing things should be dead, because it’s pretty useless at scale. But all this really is, is the naïve RAG approach. This isn’t “real” RAG and was never meant to serve applications at scale.

RAG can be a complex application that takes many steps and validation processes before returning a truly qualified response. RAG can return data that is machine-comprehensible, not just a block of text. RAG can return complete context, not just slices of similar content. LLMs can be prompted properly to understand what action to take regarding the retrieved context. The LLM’s responses can be intercepted and acted upon before being returned to the user.
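
One of those validation and interception steps can be sketched as follows, under stated assumptions: `retrieve` and `llm` are injected placeholders for your vector store and model, and `grounded` is a deliberately simple vocabulary-overlap check standing in for a real grounding validator (e.g., an entailment model or citation check). All names are illustrative.

```python
def grounded(answer, context, min_overlap=0.5):
    # Naive grounding check: reject answers whose vocabulary
    # barely overlaps the retrieved context. A real system would
    # use a stronger validator than word overlap.
    a = set(answer.lower().split())
    c = set(context.lower().split())
    return bool(a) and len(a & c) / len(a) >= min_overlap

def answer_with_guardrails(query, llm, retrieve):
    # retrieve() and llm() are injected; any store or model works.
    chunks = retrieve(query)
    if not chunks:
        # Validation step 1: refuse rather than let the LLM guess.
        return "No governed source covers this question."
    context = "\n".join(chunks)
    prompt = (
        "Answer ONLY from the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )
    response = llm(prompt)
    if not grounded(response, context):
        # Validation step 2: intercept the response before the user sees it.
        return "No grounded answer found in the retrieved material."
    return response
```

The point is architectural: the LLM call is one step in a pipeline that can refuse, retry, or rewrite, rather than the last word.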

Your imagination is the limit when it comes to implementing RAG beyond the naïve approach. If knowledge is a key requirement of your system, you might even consider the perspective that RAG is your application, and the LLM is just a composition layer.

RAG isn’t dead. People are just tired of it not working the way that they expect. It’s not working because so many people are doing it wrong.

This year, I’ve had a significant focus on developing machine-comprehensible retrieval mechanisms for systems that are required to produce regulatory-compliant responses. These responses must be free of hallucinations and must accurately reflect the true content of governed materials.

I’ve talked to several professionals recently who indicated that AI didn’t work for them, and their response to “I can fix that for you” was just to roll their eyes.

I don’t think RAG is dead, but I think, to many people, RAG is “dead to me.” It’s unfortunate that so many people are building these complex applications in the wrong way.

If your application relies on knowledge, then RAG isn’t just a feature; it’s your application. Those who dismiss RAG due to early failures risk missing out on the true promise of AI.

Those who push past these naïve implementations to develop robust, structured retrieval systems will lead the way in AI-driven outcomes. The future of AI isn’t about replacing RAG; it’s about appreciating its role and embracing its complexity.

