RAG in generative AI
Rima Modak
Data Science Engineer | Python and C++ | AI/ML, Cloud and IT | Certified in CCNA, Azure, Oracle | GHC 24
Pre-trained LLMs may not perform optimally for specific business needs. These limitations, together with the trade-offs involved in model fine-tuning, gave rise to Retrieval-Augmented Generation (RAG) as an alternative approach to enhancing performance.
Model fine-tuning: a process where a pre-trained model is further trained on a new dataset without starting from scratch.
RAG is particularly useful for applications that must interact securely with internal knowledge bases or enterprise data sources, such as Q&A chatbots, and for these use cases it is often a more suitable approach than out-of-the-box LLMs.
Retrieval-Augmented Generation (RAG) is a technique that retrieves data from outside a foundation model and uses it to augment your prompts. These prompts are natural-language texts that ask the large language model (LLM) to perform a specific task.
The key components of RAG are:
1. Retrieval - Relevant content is retrieved from external knowledge bases or other data sources based on the specifics of the user query.
2. Augmentation - The retrieved contextual information is then appended to the original user query, creating an augmented query to serve as the input to the foundation model.
3. Generation - The foundation model then generates a response based on the augmented query.
Thus, this approach helps to enhance the performance of pre-trained LLMs, which may not perform optimally for specific business needs out of the box.
See Images 1 and 2 for an illustration of this workflow.
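To make the augmentation step concrete, here is a minimal sketch of how retrieved context might be appended to a user query before it is sent to the model. The retrieved_chunks list and the prompt template are illustrative placeholders, not a prescribed format.

```python
# Illustrative sketch: building an augmented prompt from retrieved context.
# The retrieved_chunks below stand in for content returned by the retrieval step.
retrieved_chunks = [
    "Employees accrue 1.5 vacation days per month of service.",
    "Unused vacation days may be carried over for up to one year.",
]

user_query = "How many vacation days do I earn per month?"

# Augmentation: prepend the retrieved context to the original query.
augmented_prompt = (
    "Answer the question using only the context below.\n\n"
    "Context:\n"
    + "\n".join(f"- {chunk}" for chunk in retrieved_chunks)
    + f"\n\nQuestion: {user_query}"
)

print(augmented_prompt)  # This augmented prompt is what the foundation model receives.
```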
Embedding and its Relevance to RAG
Embedding refers to transforming data (text, images, audio) into numerical representations in a high-dimensional vector space using machine learning algorithms.
Embedding allows for:
- Understanding semantics
- Learning complex patterns
- Using the vector representation for applications like search, classification, and natural language processing
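As an illustration, the snippet below uses the sentence-transformers library (one of several options) to embed short texts into vectors; the model name is just a commonly used small model, not a requirement.

```python
# Illustrative sketch using sentence-transformers (pip install sentence-transformers).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # a small, widely used embedding model

texts = [
    "RAG augments prompts with retrieved context.",
    "Embeddings map text into a vector space.",
]

# Each text becomes a fixed-length vector; semantically similar texts land close together.
embeddings = model.encode(texts)
print(embeddings.shape)  # e.g., (2, 384) for this model
```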
End-to-End RAG Architecture
- Extracting data from various sources (e.g., documents, PDFs, HTML) and converting it into numerical representations (embeddings).
- The embedded data is then used to build a semantic index, which is stored in a knowledge base (e.g., vector database, graph database, SQL database).
- Obtaining relevant information from the knowledge base in response to the user's query. (Retrieval)
- The user's query is converted into a vector representation, and a semantic search is performed to find the most relevant information.
- Combining the retrieved information with the user's query to form an augmented prompt that gives the LLM additional relevant context. (Augmentation)
- In the final step, the LLM uses the augmented prompt to generate a coherent and informative response to the user's query. (Generation) A minimal end-to-end sketch of this pipeline appears below.

Similarity Measures

Semantic search depends on a similarity measure to compare the query vector with stored document vectors; cosine similarity is the most common choice, alongside dot product and Euclidean distance.
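The sketch below ties the architecture steps together, using cosine similarity as the measure for retrieval. The tiny in-memory document list, the embedding model name, and the call_llm stub are all hypothetical placeholders standing in for a real vector database and a foundation-model API.

```python
# Illustrative end-to-end RAG sketch: index, retrieve (cosine similarity), augment, generate.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

# Indexing: embed documents and keep them in a toy in-memory "knowledge base".
documents = [
    "The refund window is 30 days from the date of purchase.",
    "Support is available Monday through Friday, 9am to 5pm.",
    "Premium plans include priority support and a 60-day refund window.",
]
doc_vectors = model.encode(documents)

def cosine_similarity(a, b):
    """Cosine similarity: dot product of the vectors divided by the product of their norms."""
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

def retrieve(query, k=2):
    """Retrieval: embed the query and return the k most similar documents."""
    q = model.encode([query])[0]
    scores = [cosine_similarity(q, d) for d in doc_vectors]
    top = np.argsort(scores)[::-1][:k]
    return [documents[i] for i in top]

def call_llm(prompt):
    """Hypothetical stand-in for a foundation-model API call (e.g., a hosted LLM endpoint)."""
    return f"[LLM response to a {len(prompt)}-character augmented prompt]"

# Augmentation + generation.
query = "How long do I have to request a refund?"
context = "\n".join(retrieve(query))
augmented = f"Context:\n{context}\n\nQuestion: {query}"
print(call_llm(augmented))
```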