登录查看更多内容

Enterprises Need RAG, Not Fine-Tuning

AIM Research

Strategic insights for Artificial Intelligence Industry. For Brand collaborations, write to [email protected]

发布日期: 2024年6月26日

RAG, or retrieval augmented generation, or what some call ‘fancier prompt engineering’, often comes up for discussion when talking about hallucinations in current LLMs.?

Some people choose to fine-tune existing LLMs on their data to make them more useful, while others just connect to an external data source, which is what RAG basically is.?

The most important reason for enterprises to RAG is to reduce hallucination and provide more accurate, relevant, and trustworthy outputs while maintaining control over the information sources.?

Fine-tuning, on the other hand, with additional data is a viable option, but it carries the risk of the model “forgetting” some of its original training data. Moreover, it is mostly useful for changing the style of the generated text, rather than getting real-time updated information.?

“99% of use cases need RAG, not fine-tuning”

That is what ML engineer and teacher Santiago said when talking about GPT-3.5 pricing. It is indeed more expensive for some companies to actually fine-tune the model than to use RAG.?

However, considering both RAG and fine-tuning, Armand Ruiz, VP of product, IBM, said that fine-tuning and RAG are complementary LLM enhancement techniques. “The answer to RAG vs fine-tuning is not an either/or choice.”

Fine-tuning adapts the model’s core knowledge for specific domains, improving performance and cost-efficiency, while RAG injects up-to-date information during inference.

Considerations for choosing between RAG and fine-tuning include dynamic vs static performance, architecture, training data, model customisation, hallucinations, accuracy, transparency, cost, and complexity.

Is data annotation dying?

Speaking of data, Jason Corso, a professor of robotics at the University of Michigan, recently claimed that data annotation is a dying field. One may assume that the wave of generative AI would make data annotation jobs more abundant. But that is the exact same reason why these jobs are slowly becoming obsolete.

Though there are companies offering data annotation services in India, such as Karya, NextWealth, Appen, Scale AI, and LabelBox, AI is able to do 99% of the data labelling by itself, that too, perfectly accurately.

As Thomas Wolf of Hugging Face said, “It’s much easier to quickly spin and iterate on a pay-by-usage API than to hire and manage annotators. With model performance strongly improving and the privacy guarantee of open models, it will be harder and harder to justify making complex annotation contracts.”?

Pratibha Kumari J. 2 周前

Master Advanced Prompt Engineering to Leverage…

Data Science Dojo 10 个月前

#183 Are Lakehouses Ready for AI Guests?

Rishi Yadav 3 个月前

These will be some dangerous times for data annotation companies. Click here to find out what will replace data annotation.

RAG is tricky, sometimes

Back to RAG, though everyone claims that RAG is the future (just like data annotation was the new job), it is also extremely prone to prompt injections and data leaks.

When GPT-4 Turbo was launched along with the Retrieval API, OpenAI tried to fix the hallucination problem. But with a little fancier prompt engineering, a user was able to download the original knowledge files from someone else’s GPTs, an app built with GPT Builder that essentially uses RAG.

Most believe that RAG makes more sense when trying to retrieve more information and doing keyword searches, which is true. The problem is that it does not eliminate the need for heavy computing as much as pre-training does, but remains a cheaper alternative.

This is a big security issue for this model. If you give access to your documents to the AI model, someone can “convince” it to let them download the original files.

A lot of people said that RAG would make fine-tuning obsolete. But it was the same set of people who proclaimed that the launch of LLMs with larger context windows, such as Claude-3, would make RAG obsolete. But both of those are still alive and well.

If you can ignore these flaws for dynamic knowledge control, RAG lets you tweak and expand its internal knowledge without the hassle of retraining the entire model. Building from the ground up can be a costly and time-consuming endeavour.

RAG is advancing day by day and will continue to improve, eventually becoming more beneficial for enterprises. Does your company RAG?

NEWS BYTES

ManageEngine told AIM that the company is planning to invest another 10 million dollars in GPU and infrastructure in the next year.
Google has announced the general availability of Gemini in the Gmail side panel, extending its capabilities beyond Google Docs, Sheets, Slides, and Drive.
Tata Electronics has signed an MoU with Synopsys to collaborate on process technology bring-up and a foundry design platform to accelerate the successful ramp of customer products in India’s first fab being built by Tata Electronics in Dholera, Gujarat.
Pixxel has signed the 350th contract under the iDEX program to manufacture miniaturised multi-payload satellites for the Indian Air Force.
Motorola and Google Cloud recently announced a new multi-year relationship to bring Google’s generative AI models to Motorola phones, including the brand-new series of Razr smartphones.

Enterprises Need RAG, Not Fine-Tuning

AIM Research

Strategic insights for Artificial Intelligence Industry. For Brand collaborations, write to [email protected]

领英推荐

Sector 6

5,607 位关注者

更多精彩文章

社区洞察

其他会员也浏览了

What's the next big thing in data preparation for computer vision AI?

Generative AI Model Development: The Full Stack Approach

Gen AI Series: Data Foundations concepts for Enterprise Gen AI Solutions

Elevating Generative AI from Pilot to Production: A Blueprint for Success

Data Annotation Tools Market is set for a Potential Growth Worldwide: Excellent Technology Trends with Business Analysis

DATA LABELER, WHO IS THAT???!

Beyond the Bounding Box in an Image

Why Synthetic Data is Essential for Successful Machine Learning Models

Victoria Edelman Newsletter #2

How to Start a Low-Cost, Secure Generative AI Pilot Using RAG

领英推荐

Sector 6

5,607 位关注者

Rehiring Laid Off Developers ??

2024年10月8日

Generative AI Heats Up: $2.7B Moves, Women's Influence in VC, and More

2024年10月6日

OpenAI Brings Claude Artifacts and Cursor to ChatGPT

2024年10月5日

OpenAI Raises $6.6 Bn, Closing in on Google-Level Valuation

2024年10月3日

OpenAI is Building the iPhone of AI

2024年10月1日

GitHub Copilot Extensions are Redefining DevOps

2024年9月20日

New Trends in AI Are Here—Shocking Deals and Breakthroughs Set to Dominate 2024

2024年9月10日

AI Fundings and Workplace Shifts Are Changing the Game

2024年9月3日

All the Non-NVIDIA GPUs, Please Stand Up

2024年8月29日

From Data Science Service Providers to India’s $33B GCC Impact: Get the Scoop on 2024’s Biggest Industry Shifts!

2024年8月27日