Google Gemini 1.5: The RAG Killer?

There's a ton of noise right now about Google's new Gemini 1.5 killing RAG.

Maybe it's just clickbait.

This is nowhere near the RAG killer.


But why? I mean, the thing is massive. It boasts a 1M token context window and was tested with 10M tokens. That's roughly 8x to 80x GPT-4 Turbo's 128K context window, and the needle-in-a-haystack results in Google's research paper are super promising.

Needle in a haystack test results from Google's Gemini 1.5 white paper


But for now, RAG will live on and even benefit from larger context windows.


The primary reason to maintain RAG isn't cost or speed, as these issues will be resolved quickly. There are two particularly evident reasons why RAG remains essential for developing LLM-based applications: the context window is still too small, and authentication/authorization is still necessary.


First, a 1M token context window is way too small, and even 10M, the largest configuration Google has tested, is still insufficient.

  • The entirety of the Harry Potter series is nearly 1.5M tokens.
  • All of Stephen King's works amount to roughly 11M tokens.
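To see why even a 10M token window runs out fast, here is a minimal back-of-the-envelope sketch. The ~4 characters per token ratio is a rough heuristic for English prose, not an exact tokenizer, and the corpus sizes are illustrative assumptions:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English prose.
    # A real tokenizer (e.g. tiktoken) gives exact counts per model.
    return len(text) // 4

def fits_in_context(documents: list[str], context_window: int = 1_000_000) -> bool:
    # Sum estimated tokens across every document in the corpus
    # and check whether the whole thing fits in one prompt.
    total = sum(estimate_tokens(doc) for doc in documents)
    return total <= context_window
```

A midsize enterprise's wikis, tickets, contracts, and code easily run into the hundreds of millions of characters, so `fits_in_context` fails long before you reach mainframe source listings.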


It's unrealistic to expect that even a midsize enterprise could encapsulate its entire context in the prompt. Add code and legacy systems, especially mainframes, and it simply won't fit.


Meme of Zoolander saying "What is this? A context for ants?"


Second, permissions will be crucial for any commercially viable LLM application. Anything beyond a basic prototype will require a retrieval step that identifies the user, then authenticates and authorizes them before specific context is placed in the prompt. This can become highly sophisticated and complex, to the point that individuals may be restricted from viewing certain parts of a document, such as redacted information in court filings.
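That authorization step can be sketched as a filter between retrieval and prompt assembly. This is a minimal illustration assuming a simple role-based model; the `Document`, `User`, and function names are all hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    doc_id: str
    text: str
    allowed_roles: set[str] = field(default_factory=set)

@dataclass
class User:
    user_id: str
    roles: set[str]

def authorized_retrieve(user: User, candidates: list[Document]) -> list[Document]:
    # Keep only documents whose allowed roles intersect the user's roles,
    # so unauthorized context never reaches the prompt.
    return [doc for doc in candidates if doc.allowed_roles & user.roles]

def build_prompt(question: str, docs: list[Document]) -> str:
    # Assemble the prompt from the already-filtered documents only.
    context = "\n---\n".join(doc.text for doc in docs)
    return f"Context:\n{context}\n\nQuestion: {question}"
```

In practice the filter would live in the retriever itself (e.g. as metadata filters on a vector store query), and document-level roles would give way to field- or passage-level redaction, but the shape is the same: authorize first, then build the prompt.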


Considering these two factors, I genuinely believe RAG is here to stay. Focusing development efforts and learning resources on information sanitization, cataloging, and search will continue to provide significant benefits for AI developers.

Brad Gardner

I help companies innovate by planning and implementing the right software and technology.

1y

Agreed, RAG is here to stay. I would think there is an efficiency and performance measure here too. If your relevant context isn't 10M tokens, you may sacrifice accuracy by overdoing it.
