Intro on GraphRAG technique for RAG applications

GraphRAG is a technique that is gaining traction for RAG. Here is how it works, in short:

  • split the documents into text chunks, as in any RAG pipeline
  • use an LLM to extract entities, relationships, and key claims from each chunk
  • apply the Leiden algorithm to cluster the resulting graph hierarchically into communities
  • generate summaries of each community and its components, for a better understanding of the dataset (see picture)

[Figure: GraphRAG representation]
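
The indexing steps above can be sketched in a few lines of Python. This is only a toy illustration of the pipeline's shape, not the Microsoft implementation: every function name here is hypothetical, a capitalized-word matcher stands in for the LLM extraction step, connected components stand in for the Leiden algorithm, and the summary is a plain string join rather than an LLM call.

```python
import re
from collections import defaultdict
from itertools import combinations


def split_into_chunks(text, max_words=12):
    """Step 1: split the text into fixed-size word windows."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def extract_entities(chunk):
    """Step 2 (stand-in): a real pipeline would prompt an LLM here.
    This toy version treats capitalized words as entities."""
    return sorted(set(re.findall(r"\b[A-Z][a-z]+\b", chunk)))


def build_graph(chunks):
    """Link entities that co-occur in the same chunk."""
    graph = defaultdict(set)
    for chunk in chunks:
        entities = extract_entities(chunk)
        for entity in entities:
            graph[entity]  # touch the key so isolated entities become nodes
        for a, b in combinations(entities, 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def find_communities(graph):
    """Step 3 (stand-in): connected components instead of Leiden."""
    seen, communities = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, members = [start], set()
        while stack:
            node = stack.pop()
            if node in members:
                continue
            members.add(node)
            stack.extend(graph[node] - members)
        seen |= members
        communities.append(sorted(members))
    return communities


def summarize(community):
    """Step 4 (stand-in): a real pipeline would ask an LLM for a summary."""
    return "Community about: " + ", ".join(community)


chunks = ["Alice works with Bob in Paris", "Bob met Carol", "Zed lives in Oslo"]
communities = find_communities(build_graph(chunks))
summaries = [summarize(c) for c in communities]
```

In a real setup you would swap the stand-ins for an LLM prompt and a proper Leiden implementation, which also yields several hierarchy levels rather than a single flat partition.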

Query

At query time, you can choose among different strategies, depending on which linked nodes you want to include in the context.

It all starts from a traditional KNN vector search, but from there you can:

  • Global Search: get the broader "community" cluster that the matching text nodes belong to.
  • Local Search: also fetch the nodes linked to the matches.
  • DRIFT Search: fetch the linked nodes, with the added context of community information.
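
To make the difference between the three strategies concrete, here is a minimal sketch that follows the simplified description above. It is not the GraphRAG library's API: the tiny in-memory index, the 2D embeddings, and all function names are made up for illustration.

```python
import math

# Hypothetical toy index: node id -> (embedding, text), plus edges and communities.
NODES = {
    "n1": ([1.0, 0.0], "Alice founded Acme."),
    "n2": ([0.9, 0.1], "Acme is based in Paris."),
    "n3": ([0.0, 1.0], "Leiden is a clustering algorithm."),
    "n4": ([0.1, 0.9], "Louvain preceded Leiden."),
}
EDGES = {"n1": {"n2"}, "n2": {"n1"}, "n3": {"n4"}, "n4": {"n3"}}
COMMUNITY_OF = {"n1": "c1", "n2": "c1", "n3": "c2", "n4": "c2"}
COMMUNITY_SUMMARY = {"c1": "Facts about Acme.", "c2": "Graph clustering methods."}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def knn(query_vec, k=1):
    """Plain vector search: the k nodes most similar to the query."""
    ranked = sorted(NODES, key=lambda n: cosine(query_vec, NODES[n][0]), reverse=True)
    return ranked[:k]


def local_search(query_vec, k=1):
    """KNN hits plus the nodes directly linked to them."""
    seeds = knn(query_vec, k)
    linked = sorted({nb for s in seeds for nb in EDGES.get(s, ())} - set(seeds))
    return [NODES[n][1] for n in seeds + linked]


def global_search(query_vec, k=1):
    """Summaries of the communities that the KNN hits belong to."""
    communities = sorted({COMMUNITY_OF[s] for s in knn(query_vec, k)})
    return [COMMUNITY_SUMMARY[c] for c in communities]


def drift_search(query_vec, k=1):
    """Linked nodes, prefixed with the community context."""
    return global_search(query_vec, k) + local_search(query_vec, k)


context = drift_search([1.0, 0.0])
```

The returned list of strings is what you would place in the LLM prompt as context. The only thing that changes between strategies is which neighbors of the initial hits are pulled in.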

Examples

For more details, see this guide from Microsoft: https://microsoft.github.io/graphrag/ or this guide (with a Colab notebook) from LlamaIndex: https://docs.llamaindex.ai/en/stable/examples/query_engine/knowledge_graph_rag_query_engine/

Is GraphRAG the right approach for your data?

I find that this simpler technique proposed by Jerry Liu is also a valid alternative when the input documents are organized into chapters: https://github.com/run-llama/llama_parse/blob/main/examples/advanced_rag/dynamic_section_retrieval.ipynb

