Enterprises Need RAG, Not Fine-Tuning

Enterprises Need RAG, Not Fine-Tuning

RAG, or retrieval augmented generation, or what some call ‘fancier prompt engineering’, often comes up for discussion when talking about hallucinations in current LLMs.?

Some people choose to fine-tune existing LLMs on their data to make them more useful, while others just connect to an external data source, which is what RAG basically is.?

The most important reason for enterprises to RAG is to reduce hallucination and provide more accurate, relevant, and trustworthy outputs while maintaining control over the information sources.?

Fine-tuning, on the other hand, with additional data is a viable option, but it carries the risk of the model “forgetting” some of its original training data. Moreover, it is mostly useful for changing the style of the generated text, rather than getting real-time updated information.?

“99% of use cases need RAG, not fine-tuning”

That is what ML engineer and teacher Santiago said when talking about GPT-3.5 pricing. It is indeed more expensive for some companies to actually fine-tune the model than to use RAG.?

However, considering both RAG and fine-tuning, Armand Ruiz, VP of product, IBM, said that fine-tuning and RAG are complementary LLM enhancement techniques. “The answer to RAG vs fine-tuning is not an either/or choice.”

Fine-tuning adapts the model’s core knowledge for specific domains, improving performance and cost-efficiency, while RAG injects up-to-date information during inference.

Considerations for choosing between RAG and fine-tuning include dynamic vs static performance, architecture, training data, model customisation, hallucinations, accuracy, transparency, cost, and complexity.


Is data annotation dying?

Speaking of data, Jason Corso, a professor of robotics at the University of Michigan, recently claimed that data annotation is a dying field. One may assume that the wave of generative AI would make data annotation jobs more abundant. But that is the exact same reason why these jobs are slowly becoming obsolete.

Though there are companies offering data annotation services in India, such as Karya, NextWealth, Appen, Scale AI, and LabelBox, AI is able to do 99% of the data labelling by itself, that too, perfectly accurately.

As Thomas Wolf of Hugging Face said, “It’s much easier to quickly spin and iterate on a pay-by-usage API than to hire and manage annotators. With model performance strongly improving and the privacy guarantee of open models, it will be harder and harder to justify making complex annotation contracts.”?

These will be some dangerous times for data annotation companies. Click here to find out what will replace data annotation.


RAG is tricky, sometimes

Back to RAG, though everyone claims that RAG is the future (just like data annotation was the new job), it is also extremely prone to prompt injections and data leaks.

When GPT-4 Turbo was launched along with the Retrieval API, OpenAI tried to fix the hallucination problem. But with a little fancier prompt engineering, a user was able to download the original knowledge files from someone else’s GPTs, an app built with GPT Builder that essentially uses RAG.

Most believe that RAG makes more sense when trying to retrieve more information and doing keyword searches, which is true. The problem is that it does not eliminate the need for heavy computing as much as pre-training does, but remains a cheaper alternative.

This is a big security issue for this model. If you give access to your documents to the AI model, someone can “convince” it to let them download the original files.

A lot of people said that RAG would make fine-tuning obsolete. But it was the same set of people who proclaimed that the launch of LLMs with larger context windows, such as Claude-3, would make RAG obsolete. But both of those are still alive and well.

If you can ignore these flaws for dynamic knowledge control, RAG lets you tweak and expand its internal knowledge without the hassle of retraining the entire model. Building from the ground up can be a costly and time-consuming endeavour.

RAG is advancing day by day and will continue to improve, eventually becoming more beneficial for enterprises. Does your company RAG?


NEWS BYTES

  • ManageEngine told AIM that the company is planning to invest another 10 million dollars in GPU and infrastructure in the next year.
  • Google has announced the general availability of Gemini in the Gmail side panel, extending its capabilities beyond Google Docs, Sheets, Slides, and Drive.
  • Tata Electronics has signed an MoU with Synopsys to collaborate on process technology bring-up and a foundry design platform to accelerate the successful ramp of customer products in India’s first fab being built by Tata Electronics in Dholera, Gujarat.
  • Pixxel has signed the 350th contract under the iDEX program to manufacture miniaturised multi-payload satellites for the Indian Air Force.
  • Motorola and Google Cloud recently announced a new multi-year relationship to bring Google’s generative AI models to Motorola phones, including the brand-new series of Razr smartphones.

要查看或添加评论,请登录

社区洞察

其他会员也浏览了