Is RAG Dead?
AIM Research
Strategic insights for the Artificial Intelligence industry. For brand collaborations, write to [email protected]
No, RAG isn’t dead yet. Many experts believe it is likely to change for the better, regardless of how long LLM context windows become.
Many developers have been experimenting with RAG, for instance by building RAG apps with Llama 3 running locally, while enterprises are rolling out products such as Rovo, a new AI-powered knowledge discovery tool unveiled by Atlassian.
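For readers curious about what such a local setup involves, here is a minimal sketch of a RAG pipeline. It assumes the sentence-transformers package for embeddings and the ollama Python client with a locally pulled llama3 model; the sample documents, prompt wording, and helper function are illustrative and not taken from any project mentioned above.

```python
# Minimal local RAG sketch: embed documents, retrieve the closest matches,
# and pass them as context to a locally running Llama 3 via Ollama.
# Assumes: `pip install sentence-transformers ollama` and `ollama pull llama3`.
import numpy as np
from sentence_transformers import SentenceTransformer
import ollama

documents = [
    "RAG retrieves relevant passages and adds them to the prompt.",
    "Long-context models can ingest entire documents directly.",
    "Context caching reuses processed inputs across queries.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")
doc_vecs = embedder.encode(documents, normalize_embeddings=True)

def answer(question: str, top_k: int = 2) -> str:
    # Retrieve the top_k most similar documents by cosine similarity.
    q_vec = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vecs @ q_vec
    context = "\n".join(documents[i] for i in np.argsort(scores)[::-1][:top_k])

    # Ask the local Llama 3 model, grounding it in the retrieved context.
    response = ollama.chat(
        model="llama3",
        messages=[{
            "role": "user",
            "content": f"Answer using only this context:\n{context}\n\nQuestion: {question}",
        }],
    )
    return response["message"]["content"]

print(answer("How does RAG keep prompts small?"))
```

The point of the pattern is that only the handful of retrieved passages, not the whole corpus, ever enters the prompt.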
RAG killers?
A few months ago, many argued that even with 1-million-token context windows, RAG would still be necessary in most cases because of the cost factor.
However, in a recent interview with The New York Times, Google DeepMind chief Demis Hassabis said the company was working on caching reference materials to make subsequent processing much cheaper.
“We're working hard on optimisations... Once you've uploaded [the data] and it's processed, the subsequent questions and answering of those questions should be faster. We're confident we can get that down to the order of a few seconds,” said Hassabis.
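Google has since shipped context caching along these lines in the Gemini API. Here is a rough sketch of the idea, assuming the google-generativeai Python SDK’s CachedContent interface; the model name, placeholder corpus, and TTL are purely illustrative.

```python
# Sketch of context caching with the Gemini API (google-generativeai SDK):
# the reference material is processed once, then reused for follow-up questions.
import datetime
import google.generativeai as genai
from google.generativeai import caching

genai.configure(api_key="YOUR_API_KEY")  # placeholder key

# Imagine a large reference corpus here; the API enforces a minimum
# cached-token count, so a short string would not actually be cacheable.
long_reference_text = "..."

# Upload and process the reference material once; keep it cached for 30 minutes.
cache = caching.CachedContent.create(
    model="models/gemini-1.5-pro-001",
    display_name="reference-docs",
    contents=[long_reference_text],
    ttl=datetime.timedelta(minutes=30),
)

# Subsequent questions reuse the already-processed cached context,
# which is what makes them cheaper and faster than re-sending the full text.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
for question in ["Summarise section 2.", "What does it say about pricing?"]:
    print(model.generate_content(question).text)
```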
Along similar lines, OpenAI released a memory feature, a relatively small store that can retain a handful of facts about a user, thereby improving the AI’s understanding and functionality and, eventually, reducing cost.
Such optimisations matter because the underlying costs are enormous. Meta, for instance, said it spent nearly $30 billion on a million NVIDIA GPUs for its AI models, a figure that excludes the cost of training and inference, as acknowledged by Yann LeCun, who remarked: “It’s staggering. Isn’t it?”
It costs less to RAG?
According to reports, the cost of training top AI models continues to surge. OpenAI’s GPT-4 is estimated to have cost $78 million to train, while Google’s Gemini is estimated to have cost $191 million.
So if you choose to ditch RAG and instead stuff all your documents into the LLM’s context, the LLM will need to handle one million tokens for each query.
For example, if you use Gemini 1.5 Pro, which costs approximately $7 per million tokens, you would be paying that amount every time the full million tokens are used in a query.
The price difference is stark: the cost per call with RAG is a small fraction of the roughly $7 a single full-context query on Gemini 1.5 Pro requires, and the gap compounds for applications with frequent queries.
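To make the arithmetic concrete, here is a back-of-the-envelope comparison under stated assumptions: roughly $7 per million input tokens (the Gemini 1.5 Pro figure above), a full-context setup that sends about a million tokens per query, a hypothetical RAG setup that retrieves around 3,000 tokens of context per query, and 10,000 queries a month. The numbers are illustrative, not a benchmark.

```python
# Back-of-the-envelope cost comparison: full-context prompting vs. RAG.
# Assumptions (illustrative): $7 per 1M input tokens, ~1M tokens per
# full-context query, ~3K retrieved tokens per RAG query, 10,000 queries/month.
PRICE_PER_MILLION_TOKENS = 7.00
FULL_CONTEXT_TOKENS = 1_000_000
RAG_CONTEXT_TOKENS = 3_000
QUERIES_PER_MONTH = 10_000

def cost_per_query(tokens: int) -> float:
    return tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

full_context = cost_per_query(FULL_CONTEXT_TOKENS)   # ~$7.00 per query
rag = cost_per_query(RAG_CONTEXT_TOKENS)             # ~$0.021 per query

print(f"Full context: ${full_context:.2f}/query, ${full_context * QUERIES_PER_MONTH:,.0f}/month")
print(f"RAG:          ${rag:.3f}/query, ${rag * QUERIES_PER_MONTH:,.0f}/month")
# Roughly $7/query ($70,000/month) vs about $0.021/query ($210/month),
# before adding embedding/retrieval costs, which are typically small by comparison.
```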
Speaking of RAG, Subtl.ai, an Indian AI startup, has developed a ‘private Perplexity’ platform using lightweight models tailored for enterprises, which operates on top of existing cloud infrastructure without internet connectivity, keeping sensitive data secure.
The company told AIM that it started out using OpenAI solutions, moved to Mistral, and now uses Llama 3. It runs five models under the hood for a seamless experience, the second-largest of which has only 110 million parameters, keeping the stack lightweight and easy for customers to integrate.
In the coming weeks, Subtl.ai plans to make the model available to enterprises free for a month and to release a private product on the internet for people to test out.
Exciting News!
Join us at WiDS Bangalore 2024, where data science innovation takes centre stage! Calling all enthusiasts to submit groundbreaking papers on AI applications in fintech, healthcare, business analytics, and more.
Don't miss this chance to showcase your expertise to a global audience. Submit now and be part of the conversation! #WiDS2024
Learn more and register: [Link]
AI More Likely to Replace Your Toxic Manager than Workers
While everyone has been breaking a sweat over AI taking away their jobs, the technology has apparently zeroed in on an unlikely target: the commanders. Surprisingly, the threat may have shifted to middle and upper management, rather than the foot soldiers of an organisation. Read more here.
[Exclusive] Sarvam AI to release Indic voice LLMs soon
In our latest episode of Tech Talks, Vivek Raghavan, cofounder of Sarvam AI, discusses the company’s future plans, its work on large language models (LLMs), and the potential of the Indian AI ecosystem.
INDIA
AI Conclave Wonders
Rakuten India, in partnership with AIM, is hosting the fourth edition of the Rakuten Product Conference (RPC) ‘24 as a virtual event on May 21-22. Themed ‘Innovation Reimagined: Enterprise SaaS & AI’, the conference focuses on enterprise SaaS and AI for data scientists and innovators globally. Click here to join.