From RAG to Riches: Unleashing the potential of AI
Let's learn to make music with generative AI instead of pounding on a single LLM note. After all, there are 88 keys on a piano; we should be using them all.


If you are digging into the world of AI, and I hope you are, you’ve likely heard of RAG. You may be as excited about it as some of its early adopters (like me!).

Why all the buzz? Retrieval-augmented generation (RAG) is a game changer for GenAI. It draws on external datasets during inference to deliver more timely insights and better-quality answers. And it sharply reduces hallucinations, sidestepping one of the common pitfalls of GenAI.

Let RAG entertain you!

I love music and the RAG acronym often conjures up ragtime music for me, with its roots in multiple musical traditions. Similar to ragtime musicians who often improvised and added their own personal flair to their performances, RAG models generate new and inspiring creative content. And just as ragtime pushed musical boundaries, RAG is doing the same with AI capabilities.

RAG combines different AI and machine learning techniques to create unique and innovative outcomes. RAG can really make AI swing!

Perhaps one of the most famous ragtime pieces is “The Entertainer,” made iconic through the 1973 Paul Newman and Robert Redford movie The Sting (cue the piano melody in your head now). The movie was good enough to garner seven Academy Awards and establish itself as a cinematic classic. If famous film critic Roger Ebert had been a technology watcher, I’m certain he’d see the RAG model and give it two thumbs up.

The next (imperfect) frontier

GenAI is undoubtedly the next productivity frontier. Its ability to simplify labor-intensive tasks means people across industries and job functions can redirect their energies to more creative, innovative and strategic work.

But GenAI isn’t perfect. Current LLMs are essentially frozen in time. ChatGPT covers up to January 2022; Meta’s Llama 2 is trained to September 2022, with some data updated through July 2023. So if your prompt requires more current context, you won’t get it. Anything published after the cutoff sits outside the LLM’s “knowledge.” It’s like having a doctor who hasn’t seen any current medical journals or drug updates. Not ideal.

The scope of LLMs is enormous, but bigger isn’t always better. Because their training is so broad, they lack domain-specific knowledge and may fall short when in-depth expertise is required.

RAG allows LLMs to tap additional data resources without the time and expense of constant model fine-tuning.

It combines the strengths of retrieval-based models -- pulling real-time, accurate information from relevant datasets -- with the ability of generative models to deliver natural, reasoned responses. It also lets you cite data sources, lessens the risk of outdated information and can nearly eliminate the occurrence of LLM hallucinations (when models create false responses).
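Conceptually, the retrieval half of that pairing can be sketched in a few lines of Python. This is a toy ranker that scores documents by keyword overlap with the query; a production RAG system would use vector embeddings and a vector database instead, but the ranking idea is the same. The documents and query here are made up for illustration.

```python
# Toy stand-in for the retrieval step of RAG: rank documents by how many
# of the query's terms they contain. Real systems use embedding similarity.

def score(query: str, doc: str) -> float:
    """Fraction of query terms that appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms)

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Return the k documents most relevant to the query."""
    return sorted(docs, key=lambda d: score(query, d), reverse=True)[:k]

docs = [
    "Our refund policy changed in March 2024: refunds within 60 days.",
    "Ragtime music flourished in the early twentieth century.",
    "Support tickets are answered within one business day.",
]
print(retrieve("what is the refund policy", docs, k=1)[0])
```

The point is that retrieval narrows a living corpus down to the few passages worth showing the model, so the generation step works from current facts rather than frozen training data.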

There are significant data privacy benefits as well. Fine-tuning an LLM can expose a significant amount of your data -- which then becomes a permanent feature of said model. Combine RAG with an on-premises deployment of an open-access model like Llama 2, and those privacy and security concerns are largely addressed: the model is yours and the data never leaves your control.

The magic of RAG

Like any GenAI workflow, RAG begins with a question or prompt. But instead of relying purely on the LLM’s outdated knowledge, it first searches an up-to-date, living knowledge base and narrows the results through advanced search. Only then is the retrieved data sent to the LLM in a structured prompt.

And that’s where the magic happens. Start with a better, richer prompt with retrieved context and you’ll emerge with a better, more factual result.
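The “structured prompt” step above can be sketched like this: retrieved passages are numbered and stitched into the prompt so the model is instructed to answer from them and cite its sources. The `build_rag_prompt` helper and the example passages are hypothetical, and the actual LLM call is omitted.

```python
# Sketch of assembling a structured RAG prompt from retrieved passages.
# Each passage carries its source so the model's answer can cite [1], [2], etc.

def build_rag_prompt(question: str, passages: list[tuple[str, str]]) -> str:
    """Assemble a grounded prompt from (source, text) passages."""
    context = "\n".join(
        f"[{i}] ({source}) {text}"
        for i, (source, text) in enumerate(passages, 1)
    )
    return (
        "Answer the question using only the sources below. "
        "Cite sources by number; if the answer is not in the sources, say so.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

passages = [
    ("policy.pdf", "Refunds are available within 60 days of purchase."),
    ("faq.html", "Contact support for refund status updates."),
]
prompt = build_rag_prompt("How long do I have to request a refund?", passages)
print(prompt)
```

Because the grounding happens in the prompt rather than in the model weights, the same frozen LLM can answer from today’s documents -- no retraining required.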

The recent emergence of GenAI has been a true paradigm shift for the technology industry and beyond. And now RAG takes that even further. The seamless back-and-forth between retrieval and generation means that you can fully benefit from RAG without the expense of an LLM overhaul.

As this technology matures, expect to see more personalized, in-depth and insightful knowledge being shared to the benefit of every user. RAG will power applications that allow organizations to make better, faster and more strategic decisions – putting early and smart adopters well ahead of their peers.

It’s RAG time. Let yourself be entertained.


P.S. Want some stepwise guidance for experimenting with RAG? David O'Dell wrote a great article and provided a set of resources via GitHub that can help get you started: https://infohub.delltechnologies.com/p/using-retrieval-augmented-generation-rag-on-a-custom-pdf-dataset-with-dell-technologies/


Epo Jude

Artist (Self-employed)

3 months ago

Awesome! Retrieval-augmented generation (RAG)

Srikanth Satya

Chief Technology Officer and Chief Development Officer @ Wheels Up | Cybersecurity, Product & Engineering Ownership

10 months ago

@Matt Baker - Very well written. Given that Dell has so many storage products containing huge data sets I think RAG with custom integration to use data from underlying storage will make many more models effective.

Jennifer Webb

28 years of development and execution of sales and marketing strategies resulting in market growth and profitability.

1 year ago

Your posts are critical reading for me. So many ah-ha moments and the analogies are always spot on!

Tony Mackevicius

Global Leader | Co-Founder | Advisor | Innovator in Data Monetization, Data Protection & Quantum-Resistant Cryptography | Sustainable IT Advocate

1 year ago

Great article and congrats on the role!

Barun Pandey

Distinguished Member of Technical Staff

1 year ago

Congrats Matt!
