Moving beyond RAG

In this issue:

  1. 2.9x Lower Latency with Prompt Compression
  2. Unified Structure Learning
  3. Is it RAG? Is it FT? No, it’s RAFT!


Meet your new AI-powered data analyst!

Telescope Labs makes quality insights and Data Science more accessible by simplifying the "data to action" journey for everyone.

Want to empower your teams to develop better products and services with the help of AI? Click on the button below and try it out for free.


1. LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression

Watching: LLMLingua-2 (paper)

What problem does it solve? Prompts are a crucial component in interacting with Large Language Models (LLMs). However, as prompts become more complex and detailed to guide the model effectively, they also become longer, which introduces redundancy and inefficiency. Existing approaches to prompt compression often rely on information entropy scores obtained from a causal language model, but such scores can miss essential information and are not directly aligned with the prompt compression objective.

How does it solve the problem? The proposed approach addresses the limitations of existing prompt compression methods by introducing a data distillation procedure. This procedure derives knowledge from an LLM to compress prompts without losing crucial information. Additionally, the authors introduce an extractive text compression dataset to support the compression task. By formulating prompt compression as a token classification problem and using a Transformer encoder architecture, the model captures essential information from the full bidirectional context, ensuring the faithfulness of the compressed prompt to the original one.
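To make the token-classification framing concrete, here is a minimal sketch (not the official LLMLingua-2 implementation or API) of how a bidirectional encoder can score each token as "keep" or "drop" and reconstruct a shorter prompt. The checkpoint name and the keep ratio are placeholders.

```python
# Illustrative sketch: prompt compression framed as binary token classification
# with a bidirectional encoder. The checkpoint name is a placeholder.
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

MODEL_NAME = "your-org/token-classification-compressor"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME, num_labels=2)

def compress_prompt(prompt: str, keep_ratio: float = 0.5) -> str:
    """Keep the tokens the encoder scores as most informative."""
    enc = tokenizer(prompt, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**enc).logits              # [1, seq_len, 2]
    keep_scores = logits.softmax(-1)[0, :, 1]     # probability of the "keep" label
    k = max(1, int(keep_ratio * keep_scores.numel()))
    keep_idx = keep_scores.topk(k).indices.sort().values  # preserve original order
    kept_ids = enc["input_ids"][0, keep_idx]
    return tokenizer.decode(kept_ids, skip_special_tokens=True)

print(compress_prompt("Summarize the following meeting transcript ...", keep_ratio=0.4))
```

Because the encoder sees the full bidirectional context, the keep/drop decision for each token can account for information that appears later in the prompt, which is exactly what entropy scores from a causal LM cannot do.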

What's next? As prompt-based interaction with LLMs becomes increasingly prevalent, efficient and effective prompt compression techniques will be essential for maintaining performance while minimizing computational costs. Further research could apply this approach to a wider range of tasks and LLMs, and investigate integrating prompt compression into the LLM training process itself.


2. mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding

Watching: DocOwl 1.5 (paper/code)

What problem does it solve? Multimodal Large Language Models (MLLMs) have shown impressive capabilities in understanding and reasoning about visual documents like forms, receipts, charts, and webpages. However, current MLLMs often struggle with fully capturing the rich structural information present in these documents. Understanding the layout, spatial relationships, and hierarchical organization of elements is crucial for accurately interpreting the semantics of text-rich images.

How does it solve the problem? The researchers propose Unified Structure Learning, which combines structure-aware parsing tasks and multi-grained text localization tasks across various domains. They introduce H-Reducer, a vision-to-text module that preserves layout information while efficiently reducing the length of visual features. This enables the LLM to process high-resolution images more effectively. Additionally, they construct DocStruct4M, a comprehensive training set with structure-aware text sequences and multi-grained text-bounding box pairs, and DocReason25K, a high-quality reasoning tuning dataset for detailed explanations in the document domain.
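The following is a minimal PyTorch sketch of an H-Reducer-style vision-to-text connector, assuming a ViT patch grid as input. The 1x4 horizontal merge ratio and the hidden dimensions are illustrative choices, not the exact DocOwl 1.5 configuration.

```python
# Sketch of an H-Reducer-style connector: merge horizontally adjacent visual
# features to shorten the sequence while keeping row/column layout intact.
import torch
import torch.nn as nn

class HReducerSketch(nn.Module):
    def __init__(self, vit_dim=1024, llm_dim=4096, merge=4):
        super().__init__()
        # Convolution over the width dimension merges `merge` horizontally
        # adjacent patches into one token.
        self.reduce = nn.Conv2d(vit_dim, vit_dim, kernel_size=(1, merge), stride=(1, merge))
        self.proj = nn.Linear(vit_dim, llm_dim)

    def forward(self, patches, grid_hw):
        h, w = grid_hw
        b, n, d = patches.shape                  # n == h * w
        x = patches.transpose(1, 2).reshape(b, d, h, w)
        x = self.reduce(x)                       # [b, d, h, w // merge]
        x = x.flatten(2).transpose(1, 2)         # [b, h * (w // merge), d]
        return self.proj(x)                      # tokens fed to the LLM

feats = torch.randn(1, 32 * 32, 1024)            # e.g. a 32x32 ViT patch grid
tokens = HReducerSketch()(feats, (32, 32))
print(tokens.shape)                              # torch.Size([1, 256, 4096])
```

Merging along the width dimension rather than pooling arbitrary patches keeps the resulting tokens in reading order, which is how layout information survives the reduction.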

What's next? The proposed DocOwl 1.5 model achieves state-of-the-art performance on 10 visual document understanding benchmarks, significantly outperforming previous MLLMs built on similarly sized (7B) LLMs. This demonstrates the importance of incorporating structure learning in MLLMs for text-rich image understanding. Future research could explore extending this approach to other domains, such as scientific literature, medical records, or legal documents, where structure plays a vital role in comprehension. Additionally, investigating more efficient architectures and training strategies for structure-aware MLLMs could further enhance their practicality and scalability.


3. RAFT: Adapting Language Model to Domain Specific RAG

Watching: RAFT (paper)

What problem does it solve? Large Language Models (LLMs) are typically pretrained on vast amounts of general-domain data. However, when applying these models to specific domains or tasks, it is often necessary to incorporate additional knowledge that is not present in the pretraining data. This can be achieved through techniques like Retrieval-Augmented Generation (RAG) or fine-tuning. The challenge lies in finding the most effective way to integrate this new knowledge into the pretrained model to improve its performance on the target task.

How does it solve the problem? Retrieval Augmented FineTuning (RAFT) is a training approach that enhances the model's ability to answer questions in an "open-book", in-domain setting. Given a question and a set of retrieved documents, RAFT trains the model to disregard documents that are not relevant to answering the question, referred to as "distractor documents". It does so by training the model to quote verbatim the sequence from the relevant document that helps answer the question. Additionally, RAFT uses chain-of-thought-style responses, which further improves the model's reasoning capabilities.
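A hedged sketch of how such training data could be assembled is shown below. The field names, the distractor count, and the oracle-retention probability are illustrative assumptions rather than the paper's exact recipe.

```python
# Sketch of assembling a RAFT-style training example: mix the oracle document
# with sampled distractors, and pair the question with a chain-of-thought
# answer that quotes the supporting span verbatim.
import random

def build_raft_example(question, oracle_doc, distractor_pool,
                       cot_answer, num_distractors=3, p_oracle=0.8):
    docs = random.sample(distractor_pool, num_distractors)
    # With probability (1 - p_oracle), drop the oracle document so the model
    # also learns to fall back on memorized domain knowledge.
    if random.random() < p_oracle:
        docs.append(oracle_doc)
    random.shuffle(docs)
    context = "\n\n".join(f"[Doc {i+1}] {d}" for i, d in enumerate(docs))
    prompt = f"{context}\n\nQuestion: {question}\nAnswer:"
    return {"prompt": prompt, "completion": cot_answer}

example = build_raft_example(
    question="Which enzyme does the passage say drives reaction X?",
    oracle_doc="...enzyme Y catalyzes reaction X under acidic conditions...",
    distractor_pool=["Unrelated passage A.", "Unrelated passage B.",
                     "Unrelated passage C.", "Unrelated passage D."],
    cot_answer='The relevant passage states: "enzyme Y catalyzes reaction X'
               ' under acidic conditions". Therefore, the answer is enzyme Y.',
)
print(example["prompt"][:200])
```

Training on a mix in which the oracle document is sometimes absent encourages the model both to extract answers from relevant context and to ignore distractors, rather than copying from whatever happens to be retrieved.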

What's next? The effectiveness of RAFT in improving the performance of pretrained LLMs in domain-specific RAG tasks has been consistently demonstrated across various datasets, including PubMed, HotpotQA, and Gorilla. This suggests that RAFT could serve as a valuable post-training recipe for adapting pretrained LLMs to in-domain RAG tasks. Future research could explore the applicability of RAFT to a wider range of domains and investigate potential improvements to the technique, such as incorporating more sophisticated retrieval methods or exploring alternative ways of guiding the model's attention to relevant information within the retrieved documents.


Papers of the Week:
