Graph RAG - Streamlit chatbot to generate knowledge graph using Neo4J

Overview:-

Retrieval-Augmented Generation (RAG), or simply document-based chat, is a widely used technique in Generative AI. RAG lets developers ground the outputs of large language models (LLMs) in a specific input dataset. Fine-tuning LLMs can be complex and costly, which makes RAG an appealing alternative. Simple vector-based RAG does not always give optimal results, so various efforts are underway to improve the accuracy of the RAG process. One such approach is Graph RAG, where nodes and edges are created from the input documents and the retrieval step extracts information from the resulting graph.
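To make the Graph RAG idea concrete, here is a minimal, library-free sketch of storing facts as nodes and edges and retrieving them for a prompt. The triples and the `build_graph` and `retrieve` helper names are illustrative assumptions, not part of this article's code; in the real pipeline an LLM extracts the triples and Neo4j stores them.

```python
from collections import defaultdict

# Hypothetical triples an LLM might extract from a document: (subject, relation, object)
TRIPLES = [
    ("Neo4j", "IS_A", "Graph database"),
    ("LangChain", "INTEGRATES_WITH", "Neo4j"),
    ("Streamlit", "RENDERS", "Chatbot UI"),
]

def build_graph(triples):
    """Store every triple as an outgoing edge on its subject node."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph

def retrieve(graph, entity):
    """Return the facts attached to an entity as plain sentences for an LLM prompt."""
    return [f"{entity} {relation} {obj}" for relation, obj in graph.get(entity, [])]

graph = build_graph(TRIPLES)
context = retrieve(graph, "Neo4j")
print(context)
```

A vector-only retriever would return text chunks that look similar to the question; the graph lookup instead returns explicit relationships, which is the part Graph RAG adds.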

The hybrid search approach is covered in the next article.

Takeaway from the Article:

This article guides readers through the process of creating a Streamlit-based Retrieval-Augmented Generation (RAG) chatbot. It covers the setup of Neo4j Desktop, the creation of a knowledge graph from uploaded documents, and the visualization of this graph within the chatbot. The implementation uses Streamlit, LangChain, Neo4j, and Groq.


Let's start:-

  1. Download Neo4j Desktop. The Neo4j Aura cloud version works as well. https://neo4j.com/download/
  2. Install Python and VS Code.
  3. Create a Python project and start coding.
  4. Develop a Streamlit chatbot UI as shown below.

The full Python code and a video guide are available at the bottom of the article.

The following skeleton code gives an idea of how to implement the UI above.

import streamlit as st
from streamlit_option_menu import option_menu

def streamlit_ui():
    # Sidebar navigation menu
    with st.sidebar:
        choice = option_menu('Navigation', ['Home', 'Simple RAG', 'RAG with Neo4j'])

    if choice == 'Home':
        st.title("RAG with multiple techniques")

    elif choice == 'Simple RAG':
        source_docs = st.file_uploader(label="Upload a document", type=['pdf'],
                                       accept_multiple_files=True)
        # <remaining business logic>

    elif choice == 'RAG with Neo4j':
        source_docs = st.file_uploader(label="Upload a document", type=['pdf'],
                                       accept_multiple_files=True)
        # Only build and show the graph once at least one file is uploaded
        if source_docs:
            create_graph(source_docs)
            visualize_graph()

5. The user uploads a document from the UI. That document is passed as an argument to the following code, which creates the graph.

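# Pass the documents in and split them into chunks.
# text_splitter, llm_transformer, and graph are assumed to be created elsewhere
# (e.g. a LangChain text splitter, an LLMGraphTransformer, and a Neo4jGraph).
def create_graph(source_docs):
    # source_docs are expected to be LangChain Document objects at this point
    texts = text_splitter.split_documents(source_docs)

    # Transform the text chunks into graph documents (nodes and relationships)
    graph_documents = llm_transformer.convert_to_graph_documents(texts)

    # Store the graph documents in Neo4j
    graph.add_graph_documents(
        graph_documents,
        baseEntityLabel=True,
        include_source=True
    )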
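The splitting step above is what keeps each LLM call small. As a rough illustration of what a text splitter does, here is a simplified stand-in that cuts text into fixed-size, overlapping character chunks. The `split_text` name and its parameters are assumptions made for this sketch; LangChain's splitters are more careful (they prefer to break on separators such as paragraphs and sentences), so treat this only as a mental model.

```python
def split_text(text, chunk_size=200, chunk_overlap=20):
    """Split text into fixed-size character chunks that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text
        if start + chunk_size >= len(text):
            break
    return chunks

sample = "abcdefghij" * 3  # 30 characters
chunks = split_text(sample, chunk_size=12, chunk_overlap=2)
print(chunks)
```

The overlap means the tail of one chunk is repeated at the head of the next, so an entity mentioned at a chunk boundary is less likely to be cut in half before the graph transformer sees it.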

6. Open Neo4j and check the database for the generated graph.

Neo4J browser UI.

7. Display the graph in the chatbot.

# neo4j_data.py
from neo4j import GraphDatabase

# Neo4j connection details
uri = "bolt://localhost:7687"  # Adjust if using a different host/port
username = "neo4j"
password = "password"  # Replace with your password

# Connect to the Neo4j database
driver = GraphDatabase.driver(uri, auth=(username, password))


def fetch_graph_data():
    query = """
    MATCH (n)-[r]->(m)
    RETURN n, r, m
    """
    with driver.session() as session:
        results = session.run(query)
        nodes = []
        edges = []
        for record in results:
            n = record["n"]
            m = record["m"]
            r = record["r"]
            nodes.append(n)
            nodes.append(m)
            edges.append((n.id, m.id, r.type))
        # Removing duplicates
        nodes = {n.id: n for n in nodes}.values()
        return nodes, edges

# Don't forget to close the driver
def close_driver():
    driver.close()
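The deduplication logic in fetch_graph_data can be tried out without a running database. The sketch below repeats the same idea on stand-in objects; `FakeNode`, `FakeRel`, and `collect_graph` are hypothetical names introduced only for this illustration and assume that nodes expose `id` and `labels` and relationships expose `type`, as used in the code above.

```python
from collections import namedtuple

# Stand-ins for neo4j Node and Relationship objects (assumed shape: id, labels, type)
FakeNode = namedtuple("FakeNode", ["id", "labels"])
FakeRel = namedtuple("FakeRel", ["type"])

def collect_graph(records):
    """Deduplicate nodes by id and keep one (source, target, type) tuple per record."""
    nodes = {}
    edges = []
    for record in records:
        n, r, m = record["n"], record["r"], record["m"]
        nodes[n.id] = n  # a repeated id simply overwrites the earlier entry
        nodes[m.id] = m
        edges.append((n.id, m.id, r.type))
    return list(nodes.values()), edges

alice = FakeNode(1, frozenset({"Person"}))
bob = FakeNode(2, frozenset({"Person"}))
acme = FakeNode(3, frozenset({"Company"}))
records = [
    {"n": alice, "r": FakeRel("KNOWS"), "m": bob},
    {"n": alice, "r": FakeRel("WORKS_AT"), "m": acme},
]
nodes, edges = collect_graph(records)
print(len(nodes), edges)
```

Keying the dictionary by node id is what prevents a node that takes part in several relationships from being added to the visualization more than once.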
        
# app.py
import streamlit as st
import streamlit.components.v1 as components
from pyvis.network import Network
import neo4j_data

def visualize_graph():
    # Title of the Streamlit app
    st.title("Neo4j Graph Visualization")

    # Fetch the graph data
    nodes, edges = neo4j_data.fetch_graph_data()

    # Create a PyVis network
    net = Network(height="750px", width="100%", notebook=True)

    # Add nodes and edges to the network
    for node in nodes:
        # node.labels is a frozenset, so join it into a string for the tooltip
        net.add_node(node.id, label=str(node.id), title=", ".join(node.labels))

    for edge in edges:
        net.add_edge(edge[0], edge[1], title=edge[2])

    # Write the network to an HTML file without opening a browser
    net.save_graph("graph.html")

    # Display the network in Streamlit
    with open("graph.html", "r", encoding="utf-8") as html_file:
        source_code = html_file.read()
    components.html(source_code, height=800, width=1000)

    # Close the Neo4j driver when done
    neo4j_data.close_driver()
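In the visualization code above, each node is labeled with its numeric id, which is hard to read in the rendered graph. One possible refinement, sketched here under the assumption that the nodes carry a readable property such as `name`, `id`, or `title`, is a small helper that picks a display label from the node's properties. The `display_label` name and the property keys it checks are hypothetical and should be adapted to whatever properties the graph actually contains.

```python
def display_label(properties, fallback):
    """Pick a human-readable label from a node's properties, else use the fallback."""
    for key in ("name", "id", "title"):
        value = properties.get(key)
        if value:
            return str(value)
    return str(fallback)

print(display_label({"id": "Neo4j"}, 42))
print(display_label({}, 42))
```

If this fits your data, the helper could be passed the node's property dictionary when calling net.add_node, with the numeric id as the fallback.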

8. Run the Streamlit app (for example with `streamlit run app.py`), navigate to "RAG with Neo4j", and upload the document.

Knowledge graph in Chatbot

The entire concept and a walkthrough of the Python code are available in the video guide below:


Video link of the Hybrid search retrieval:-


Conclusion :

RAG is an important aspect of AI tools, and multiple improvements to it have been implemented or are in progress. This article shows how to create the knowledge graph. In the next article, I will show how to retrieve information from both the vector store and the graph using a hybrid approach.


