Google Gemini 1.5: The RAG Killer?

There's a ton of noise right now about Google's new Gemini 1.5 killing RAG.

Maybe it's just clickbait.

This is nowhere near the RAG killer.


But why? I mean, the thing is massive. It boasts a 1M token context window and was tested with 10M tokens. That's roughly 8x to 80x GPT-4 Turbo's 128K context window, and the needle-in-a-haystack results in Google's research paper are super promising.

Needle in a haystack test results from Google's Gemini 1.5 white paper


But for now, RAG will live on and even benefit from larger context windows.


The primary reason to maintain RAG isn't cost or speed, as these issues will be resolved quickly. There are two particularly evident reasons why RAG remains essential for developing LLM-based applications: the context window is still too small, and authentication/authorization is still necessary.


First, a 1M token context window is way too small, and even 10M, the largest configuration Google has tested, is still insufficient.

  • The entirety of the Harry Potter series is nearly 1.5M tokens.
  • All of Stephen King's works amount to roughly 11M tokens.
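To see why even a 10M token window runs out fast, here is a minimal back-of-the-envelope sketch. The ~4 characters per token ratio is a rough heuristic for English prose, not an exact tokenizer, and the corpus sizes are illustrative assumptions:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English prose.
    # A real tokenizer (e.g. tiktoken) gives exact counts per model.
    return len(text) // 4

def fits_in_context(documents: list[str], context_window: int = 1_000_000) -> bool:
    # Sum estimated tokens across every document in the corpus
    # and check whether the whole thing fits in one prompt.
    total = sum(estimate_tokens(doc) for doc in documents)
    return total <= context_window
```

A midsize enterprise's wikis, tickets, contracts, and code easily run into the hundreds of millions of characters, so `fits_in_context` fails long before you reach mainframe source listings.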


It's unrealistic to expect that even a midsize enterprise could encapsulate its entire context in the prompt. Add code and legacy systems, especially mainframes, and it simply won't fit.


Meme of Zoolander saying "What is this? A context for ants?"


Second, permissions will be crucial for any commercially viable LLM application. Anything beyond a basic prototype will require a retrieval step that identifies the user, then authenticates and authorizes them before specific context is placed in the prompt. This can become highly sophisticated and complex, to the point that individuals may be restricted from viewing certain parts of a document, such as redacted information in court filings.
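That authorization step can be sketched as a filter between retrieval and prompt assembly. This is a minimal illustration assuming a simple role-based model; the `Document`, `User`, and function names are all hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    doc_id: str
    text: str
    allowed_roles: set[str] = field(default_factory=set)

@dataclass
class User:
    user_id: str
    roles: set[str]

def authorized_retrieve(user: User, candidates: list[Document]) -> list[Document]:
    # Keep only documents whose allowed roles intersect the user's roles,
    # so unauthorized context never reaches the prompt.
    return [doc for doc in candidates if doc.allowed_roles & user.roles]

def build_prompt(question: str, docs: list[Document]) -> str:
    # Assemble the prompt from the already-filtered documents only.
    context = "\n---\n".join(doc.text for doc in docs)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

In practice the filter would live in the retriever itself (e.g. as metadata filters on a vector store query), and document-level roles would give way to field- or passage-level redaction, but the shape is the same: authorize first, then build the prompt.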


Considering these two factors, I genuinely believe RAG is here to stay. Focusing development efforts and learning resources on information sanitization, cataloging, and search will continue to provide significant benefits for AI developers.

Brad Gardner

I help companies innovate by planning and implementing the right software and technology.

1y

Agreed, RAG is here to stay. I would think there is an efficiency and performance measure here too. If your relevant context isn't 10M tokens, you may sacrifice accuracy by overdoing it.
