When to Use GraphRAG

Louis-François Bouchard

Making AI accessible. What's AI on YouTube. Co-founder at Towards AI. ex-PhD Student.

Published: August 12, 2024

Good morning everyone! In this iteration, we focus on the new hype in LLMs: GraphRAG.

GraphRAG is a powerful extension to the Retrieval-Augmented Generation (RAG) stack making a lot of noise thanks to Microsoft and LlamaIndex’s contributions.

But the question remains: Should YOU be using it?

To answer when we need it, we first need to understand what it is...

This issue is brought to you thanks to Yandex.

1. From QuIP to AQLM with PV-Tuning: LLM compression at the extreme

The trade-off between large model size and computational efficiency has long been a challenge in deploying language models. The research community has been looking to reduce model size by 8 times, down to 2 bits. This year, they found a way to do it without sacrificing model performance.

Here's the story behind the evolution of extreme LLM compression methods.

2. GraphRAG

Before we start: I wrote this piece with two friends at Towards AI for our weekly High Learning Rate newsletter (which you should follow), where we share real-world solutions for real-world problems and do our best to teach you how to leverage AI's potential with insider tips from specialists in the field, every week.

What is GraphRAG?

GraphRAG enhances traditional RAG by incorporating knowledge graphs into the retrieval process. Instead of relying solely on vector similarity (comparing numbers to find the most relevant ‘similar’ matches), GraphRAG extracts entities and relationships from your data, creating a structured representation that captures semantic connections. Semantic means understanding the meaning behind words or data, in a specific context, not just their literal definitions. This approach allows for more nuanced and context-aware retrieval, potentially leading to more accurate and comprehensive responses from your LLM.

A knowledge graph is simply a structured representation of data that captures entities and their relationships, allowing for better understanding and retrieval of information.
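To make the idea concrete, here is a minimal sketch of a knowledge graph stored as (subject, relation, object) triples. The papers, author, and relation names are made up for illustration; a real GraphRAG pipeline would extract such triples from your documents with an LLM rather than hard-code them.

```python
from collections import defaultdict

# Hypothetical triples, standing in for what an entity/relationship
# extraction step might produce from a small corpus of papers.
TRIPLES = [
    ("Paper A", "cites", "Paper B"),
    ("Paper B", "cites", "Paper C"),
    ("Paper A", "written_by", "Alice"),
    ("Paper C", "about", "RAG"),
]


def build_graph(triples):
    """Index triples by subject so an entity's relationships are easy to look up."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def neighbors(graph, entity, relation=None):
    """Return entities directly connected to `entity`, optionally filtered by relation."""
    return [
        obj
        for rel, obj in graph.get(entity, [])
        if relation is None or rel == relation
    ]


graph = build_graph(TRIPLES)
print(neighbors(graph, "Paper A"))           # ['Paper B', 'Alice']
print(neighbors(graph, "Paper A", "cites"))  # ['Paper B']
```

Retrieval can then follow explicit relationships ("what does Paper A cite?") instead of relying only on which chunks look numerically similar to the query.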

When to Use GraphRAG: It's All About Your Data

The decision to implement GraphRAG heavily depends on your dataset's nature. If your data is rich in interconnected entities and relationships - think academic papers (many cite each other and build on one another over time), corporate knowledge bases, or complex historical records - GraphRAG might outperform regular RAG. It is well suited to capturing and leveraging these connections, enabling more informed and contextually relevant retrievals that standard RAG might miss.

User Queries: Complexity is Key

GraphRAG is most useful for complex, multi-faceted queries that require traversing multiple pieces of information, or for meta-questions about the data itself, such as "How many papers about RAG were published between 2010 and 2020?" (Spoiler: 0). If your users frequently ask questions like "How does the theory proposed in Paper A relate to the findings in Paper B, and what are the implications for field C?", GraphRAG's ability to navigate and synthesize information across your knowledge graph becomes essential. Regular RAG might only retrieve the chunks most relevant to some of these topics, leaving the LLM to hallucinate the rest.
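The two query types mentioned here can be sketched on a toy graph: a multi-hop question becomes a path search across citation edges, and a meta-question becomes a count over node metadata. The paper names, edges, and years below are invented for illustration, and the breadth-first search is only one simple way to traverse a graph, not how any particular GraphRAG library does it.

```python
from collections import deque

# Hypothetical citation edges (paper -> papers it cites) and publication years.
CITES = {
    "Paper A": ["Paper B"],
    "Paper B": ["Paper C"],
    "Paper C": [],
    "Paper D": ["Paper C"],
}
YEAR = {"Paper A": 2023, "Paper B": 2021, "Paper C": 2019, "Paper D": 2022}


def find_path(edges, start, goal):
    """Breadth-first search for the shortest citation chain from start to goal."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain of citations connects the two papers


def count_published_between(years, first, last):
    """Meta-question over the dataset: how many papers fall in a year range?"""
    return sum(1 for year in years.values() if first <= year <= last)


print(find_path(CITES, "Paper A", "Paper C"))     # ['Paper A', 'Paper B', 'Paper C']
print(count_published_between(YEAR, 2020, 2022))  # 2
```

A chunk-based retriever has no direct way to answer either question, because neither the chain A → B → C nor the count is written in any single chunk.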

Data Storage Considerations

While GraphRAG can work with various data storage systems, it's particularly powerful when your data is already structured in a graph-like format or can be easily transformed into one. Graph databases like Neo4j or Amazon Neptune are natural fits, but even relational databases can be leveraged if you have a clear understanding of the relationships between your data entities.
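As a rough illustration of leveraging a relational database, the sketch below turns a join table into graph edges using Python's built-in sqlite3 module. The `papers` and `citations` tables and their columns are hypothetical; the point is only that a foreign-key relationship you already understand maps directly onto (source, relation, target) edges.

```python
import sqlite3

# In-memory database with a hypothetical schema: papers plus a citation join table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE papers (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE citations (citing_id INTEGER, cited_id INTEGER);
    INSERT INTO papers VALUES (1, 'Paper A'), (2, 'Paper B'), (3, 'Paper C');
    INSERT INTO citations VALUES (1, 2), (2, 3);
    """
)


def load_edges(connection):
    """Resolve each citation row into a ('source', 'cites', 'target') edge."""
    rows = connection.execute(
        """
        SELECT src.title, dst.title
        FROM citations
        JOIN papers AS src ON src.id = citations.citing_id
        JOIN papers AS dst ON dst.id = citations.cited_id
        ORDER BY src.id, dst.id
        """
    ).fetchall()
    return [(source, "cites", target) for source, target in rows]


edges = load_edges(conn)
print(edges)  # [('Paper A', 'cites', 'Paper B'), ('Paper B', 'cites', 'Paper C')]
```

The resulting edge list could then be loaded into a graph database or an in-memory graph structure for traversal.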

P.S. Ideally, you want a dataset built with relationship information (such as who cites whom), but you do not strictly need one. Fortunately for us, libraries like Microsoft’s GraphRAG extract this automatically, using a strong LLM to find the entities and relationships in our data.

When to Skip GraphRAG

Despite its power, GraphRAG isn't always the best choice. For simpler datasets (and single-faceted queries) with straightforward relationships or when dealing primarily with structured text documents, traditional RAG or advanced search methods might be more efficient. Advanced methods include hybrid search, which combines vector similarity and keyword search, or techniques that use metadata filtering to narrow down search possibilities.
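Here is a minimal sketch of hybrid search with an optional metadata filter. The documents, their two-dimensional "embeddings", the `alpha` weighting, and the `min_year` filter are all assumptions made for the example; a real system would use a proper embedding model and typically a keyword scorer such as BM25 rather than simple word overlap.

```python
import math

# Hypothetical documents with toy 2-D embeddings and a metadata field.
DOCS = [
    {"id": "d1", "text": "graph databases store entities and relationships",
     "embedding": [0.9, 0.1], "year": 2023},
    {"id": "d2", "text": "vector search compares embeddings",
     "embedding": [0.2, 0.8], "year": 2024},
    {"id": "d3", "text": "keyword search matches exact terms",
     "embedding": [0.1, 0.9], "year": 2019},
]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Fraction of query words that appear in the text (a crude keyword match)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def hybrid_search(query, query_embedding, docs, alpha=0.5, min_year=None):
    """Rank docs by a weighted mix of vector and keyword scores.

    `min_year` is a metadata filter applied before scoring.
    """
    candidates = [d for d in docs if min_year is None or d["year"] >= min_year]
    scored = [
        (
            alpha * cosine(query_embedding, d["embedding"])
            + (1 - alpha) * keyword_score(query, d["text"]),
            d["id"],
        )
        for d in candidates
    ]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


results = hybrid_search("keyword search", [0.1, 0.9], DOCS)
print(results)  # ['d3', 'd2', 'd1']
print(hybrid_search("keyword search", [0.1, 0.9], DOCS, min_year=2020))  # ['d2', 'd1']
```

Setting `alpha` closer to 1 favors semantic similarity, while values closer to 0 favor exact term matches.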

It’s important to note that GraphRAG introduces additional complexity and computational overhead in index creation and query processing, which may not be justified for straightforward information lookup tasks. This is an example from Microsoft’s paper comparing traditional RAG and GraphRAG for the same query:

Even though the results are more interesting, GraphRAG requires almost 10x more time and 10x more tokens to produce. Make sure you need it!

Combining Approaches: The Router Strategy

In real-world applications, a one-size-fits-all approach rarely works. Consider implementing a router system that can dynamically choose between GraphRAG, Advanced RAG, text-to-SQL retrieval, or any other search method based on the query type and available data. This flexible approach ensures you're using the most appropriate retrieval method for each specific query, optimizing both performance and accuracy. You will need a capable base LLM and a well-designed prompt to route each query to the right retrieval system.
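A router can be sketched as a classifier plus a lookup table of retrievers. In the sketch below, `classify_query` uses keyword rules purely as a stand-in for the LLM-and-prompt classification step, and the three retrievers are placeholder functions; all names are hypothetical.

```python
def classify_query(query):
    """Stand-in for an LLM classifier: pick a retrieval method from keyword cues."""
    q = query.lower()
    # Aggregation-style questions go to text-to-SQL (checked first on purpose,
    # since "how many ... between X and Y" also contains a relational cue).
    if any(cue in q for cue in ("how many", "count", "average", "total")):
        return "text_to_sql"
    # Questions about how things connect go to GraphRAG.
    if any(cue in q for cue in ("relate", "connection", "between", "implications")):
        return "graph_rag"
    # Everything else falls back to regular or advanced RAG.
    return "advanced_rag"


# Placeholder retrievers; real ones would query a graph, a database, or a vector index.
RETRIEVERS = {
    "graph_rag": lambda q: f"[GraphRAG] {q}",
    "text_to_sql": lambda q: f"[text-to-SQL] {q}",
    "advanced_rag": lambda q: f"[Advanced RAG] {q}",
}


def route(query):
    """Classify the query, then dispatch it to the matching retriever."""
    name = classify_query(query)
    return name, RETRIEVERS[name](query)


for question in (
    "How does Paper A relate to Paper B?",
    "How many papers were published in 2020?",
    "What is GraphRAG?",
):
    print(route(question)[0])
```

Swapping the rule-based classifier for an LLM call changes only `classify_query`; the dispatch table stays the same, which makes it easy to add or remove retrieval methods.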

TL;DR: GraphRAG - Powerful but Not Universal

GraphRAG offers a significant improvement in information retrieval capabilities for complex, interconnected datasets and queries requiring deep relational understanding. However, it comes with increased complexity and resource requirements. Evaluate your specific use case, data structure, and query patterns carefully. For many applications, a combination of retrieval methods, orchestrated by a smart router, will provide the best balance of performance and flexibility.

And that's it for this iteration! I'm incredibly grateful that the What's AI newsletter is now read by over 20,000 incredible human beings. Click here to share this iteration with a friend if you learned something new!

Looking for more cool AI stuff?

Looking for AI news, code, learning resources, papers, memes, and more? Follow our weekly newsletter at Towards AI!
Looking to connect with other AI enthusiasts? Join the Discord community: Learn AI Together!

Want to share a product, event or course with my AI community? Reply directly to this email, or visit my Passionfroot profile to see my offers.

Thank you for reading, and I wish you a fantastic week! Be sure to get enough sleep and physical activity next week!

Louis-François Bouchard

The What's AI Newsletter

