Build Your Business-specific LLMs Using RAG

When we talk about large language model (LLM) implementations in a business context, you will hear the term Retrieval-Augmented Generation (RAG) everywhere, often presented as a magic wand for scenarios where generative AI needs to rely on your own data. RAG is the solution for bringing your business data and the LLM together so that you get the desired outputs.

So, I thought I would go through the fundamentals of RAG, purely for understanding and clarity. In the 2020 paper "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks," Meta introduced a retrieval-augmented generation framework that gives LLMs access to information beyond their training data. RAG allows LLMs to build on a specialized body of knowledge to answer questions more accurately.

Retrieval-augmented generation (RAG) in Large Language Models (LLMs) enhances the model’s ability to generate responses by dynamically retrieving relevant information from a large dataset or database at the time of the query. This approach combines the generative power of LLMs with the specificity and accuracy provided by external data sources, enabling the model to produce more accurate, detailed, and contextually relevant outputs.

How RAG Works:

  1. Query Processing: When a query or prompt is received, the RAG system interprets the request.
  2. Data Retrieval: It then searches a connected database or knowledge base (PDFs, text documents, code repositories, etc.) to find information relevant to the query.
  3. Content Generation: The retrieved information is fed into the LLM, which uses this context to generate a more informed and accurate response (a minimal sketch of this loop follows below).
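To make the three steps concrete, here is a minimal, self-contained sketch in Python. Everything in it is illustrative: the toy document corpus, the bag-of-words retriever, and the `call_llm` placeholder all stand in for what a production system would use (dense vector embeddings, a vector database, and a real model API), but the retrieve-then-generate loop is the same.

```python
# Minimal RAG sketch: keyword-overlap retrieval plus a stubbed LLM call.
# `call_llm` is a placeholder -- swap in your model provider's API.

from collections import Counter
import math

# 1. A toy "knowledge base" standing in for your PDFs, docs, wikis, etc.
DOCUMENTS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "Support hours are 9am to 5pm, Monday through Friday.",
    "Enterprise customers get a dedicated account manager.",
]

def embed(text: str) -> Counter:
    # Bag-of-words term counts; real systems use dense vector embeddings.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 2) -> list:
    # 2. Data retrieval: rank documents by similarity to the query.
    q = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def call_llm(prompt: str) -> str:
    # Placeholder: call your LLM here (OpenAI, Azure OpenAI, Cohere, ...).
    return f"[LLM answer grounded in a prompt of {len(prompt)} characters]"

def rag_answer(query: str) -> str:
    # 3. Content generation: feed retrieved context into the model.
    context = "\n".join(retrieve(query))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
    return call_llm(prompt)

print(rag_answer("What is the refund policy?"))
```

The key design point: the model only sees the retrieved context at query time, so nothing about your documents has to be baked into the model's weights.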

Example:

Suppose you are using a RAG-enhanced LLM for a medical information system. A user asks, “What are the latest treatment options for type 2 diabetes?”

  • Interpretation: The RAG system interprets the query to understand that it needs information on recent diabetes treatments.
  • Retrieval: It queries the connected medical database or sources of medical information stored in its knowledge base, retrieving articles, studies, and guidelines related to the latest treatment options for type 2 diabetes.
  • Generation: The LLM, now equipped with the latest retrieved information, generates a response summarizing the current treatment options, perhaps mentioning new drugs, lifestyle modification strategies, and the latest findings from recent studies. The snippet after this list shows what the resulting augmented prompt might look like.
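Continuing the sketch above, the augmented prompt assembled for this query might look like the following. The retrieved passages here are invented placeholders, not real guidelines or study results.

```python
# Hypothetical augmented prompt for the diabetes query. The retrieved
# passages are made-up placeholders standing in for real articles,
# studies, and guidelines pulled from the medical knowledge base.
retrieved = [
    "Guideline excerpt: GLP-1 receptor agonists are recommended when ...",
    "Study excerpt: SGLT2 inhibitors reduce cardiovascular risk in ...",
]
prompt = (
    "Answer using only the context below.\n\n"
    "Context:\n" + "\n".join(f"- {p}" for p in retrieved) +
    "\n\nQuestion: What are the latest treatment options for type 2 diabetes?"
)
print(prompt)
```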

Without RAG, an LLM would have to rely solely on the information it was trained on, which might be outdated or lack the specific details found in newly published research. RAG ensures the model’s output is current and grounded in the most relevant available data, significantly enhancing the quality and utility of the response.

What are the use cases for RAG (Retrieval-Augmented Generation)?

  • Question-Answering Chatbots: Integrating LLMs with chatbots lets them autonomously generate more precise answers by accessing company documents and knowledge bases. This approach is primarily used to enhance customer support, automate website responses, and add business context and data, so inquiries get quick answers and issues are resolved efficiently.
  • Enhanced Search Capabilities: When combined with search engines, LLMs can enrich search outcomes with generated responses, improving the accuracy of informational queries. This advancement makes it simpler for users to locate the information they need for their tasks.
  • Data Query Engines: Using company data as context for LLMs enables employees to get answers to their questions effortlessly. This application is handy for questions that span documents across divisions such as HR, Finance, Procurement, and Legal, for example, questions about company policies, benefits, and compliance standards.

These use cases demonstrate the versatility and potential of RAG to transform information retrieval and interaction within organizations. Next week, I will go through the technical aspects of RAG and how it works.


If you want this newsletter through email, subscribe here at AI Tech Circle.

Weekly News & Updates...

This week's unveiling of new AI tools and products drives the technology revolution forward.

  1. Aya, Cohere's open-source multilingual LLM, is available on Kaggle, so head over and start exploring.
  2. Gemma, Google's family of open language models, is now available in the KerasNLP collection.
  3. Gemini Business from Google will be available in the Google Workspace apps.
  4. The EU’s AI Act and How Companies Can Achieve Compliance

The Cloud: the backbone of the AI revolution

Favorite Tip Of The Week:

Here's my favorite resource of the week.

Potential of AI

  • Experiment: Figma to Replit Plugin: This experimental plugin turns static designs into responsive React components. Export the generated code to Replit to share an instantly deployable React app.

Things to Know

  • Stability AI has released an early preview of Stable Diffusion 3, a text-to-image model with significantly improved performance in multi-subject prompts, image quality, and spelling abilities.

The Opportunity...

Podcast:

  • This week's Open Tech Talks episode 126 is "Web3 Unveiled: Revolutionizing Digital Engagement with Viktoriia Miracle"

Apple | Spotify | Google Podcast

Courses to attend:

Events:

Tech and Tools...

  • Gemma in PyTorch: PyTorch implementation of Gemma models
  • SoraWebui is an open-source project that simplifies video creation by allowing users to generate videos online with OpenAI's Sora model using text
  • ChatGPT + Enterprise data with Azure OpenAI and AI Search

Data Sets...

  • fastMRI Dataset from NYU School of Medicine and NYU Langone Health
  • ROSE: A Retinal OCT-Angiography Vessel SEgmentation Dataset

Other Technology News

Want to stay on the cutting edge?

Here's what else is happening in Information Technology that you should know about:

  • Cyberattacks are the No. 1 worry for business leaders - and AI may be able to help, as reported by Fortune

Earlier editions of the newsletter

That's it!

As always, thanks for reading.

Hit reply and let me know what you found most helpful this week - I'd love to hear from you!

Until next week,

Kashif Manzoor


The opinions expressed here are solely my conjecture based on experience, practice, and observation. They do not represent the thoughts, intentions, plans, or strategies of my current or previous employers or their clients/customers. The objective of this newsletter is to share and learn with the community.

Dev Aditya

AI in Education and Learning Expert, Creator of the world's first publicly available AI teacher, Upskilled 47,000 learners globally, Multiple Award recipient including from the Prime Minister of UK and 30 under 30 (Mint)

1 year

RAG is efficient. Data ingestion and prompt curation are key to a good output.

Godwin Josh

Co-Founder of Altrosyn and Director at CDTECH | Inventor | Manufacturer

1 year

RAG's approach to building customized AI models aligns with the ongoing trend of tailoring large language models (LLMs) for specific business needs. This customization, tapping into real-time data, reflects the growing emphasis on precision and efficiency in AI applications. Looking back, historical data often showcases the evolution of AI customization, but how do you see RAG's methodology addressing potential challenges such as ethical considerations and bias in real-time data integration? Exploring these facets could provide valuable insights into refining AI models for responsible and inclusive deployment.
