Hybrid Graphs for Table-and-Text Based Question Answering Using LLMs
Florent LIU
Data Architect, Full Stack Data Engineer (Big Data), and Full Stack AI Developer.
Background and Motivation
In today’s data-rich environment, information is often scattered across structured (e.g., tables, databases) and unstructured (e.g., raw text) sources.
Answering questions that require reasoning across both types of data—referred to as Table-Text Question Answering (QA)—poses significant challenges.
Current methods often rely on fine-tuning models with high-quality, human-curated data, which is expensive and time-consuming to obtain.
Recent advancements in Large Language Models (LLMs) have shown promise in zero-shot QA tasks, but their application to multi-source Table-Text QA remains underexplored.
Objective
To introduce ODYSSEY, a novel Hybrid Graph-based approach for Table-Text QA that leverages LLMs without fine-tuning.
The goal is to efficiently answer hybrid questions by constructing a unified Hybrid Graph from both tabular and textual data, pruning irrelevant information, and providing the LLM with concise, relevant context.
Key Contributions
1. Hybrid Graph Construction:
- A unified graph is built by integrating structured (table) and unstructured (text) data. The graph captures relationships between entities in the table and linked documents.
- The graph is pruned based on the input question to filter out noise and retain only relevant information.
2. Zero-Shot QA Framework:
- The system operates in a zero-shot setting, meaning it does not require fine-tuning or labeled training data.
- It uses LLMs (GPT-3.5, GPT-4, LLaMA-3) to answer questions by leveraging the pruned Hybrid Graph.
3. Efficiency Improvements:
- The approach reduces token usage by up to 53% compared to providing the full context (table and text) to the LLM.
- It achieves state-of-the-art (SoTA) performance on the Hybrid-QA and OTT-QA datasets, improving Exact Match (EM) scores by 10% on Hybrid-QA and 5.4% on OTT-QA.
Methodology
The ODYSSEY framework consists of four main steps:
1. Question Analysis:
- The input question is analyzed to extract key entities and map them to relevant table headers. This step identifies the necessary information to answer the question.
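The header-mapping part of this step can be sketched with simple fuzzy string matching; the paper's actual semantic module delegates entity extraction and header mapping to an LLM, so the function and threshold below are illustrative only:

```python
from difflib import SequenceMatcher

def map_question_to_headers(question, headers, threshold=0.6):
    """Map question tokens to the most similar table headers.

    A minimal keyword-similarity sketch; the real system uses an LLM
    for entity extraction and header mapping. `threshold` is an
    illustrative cut-off, not a value from the paper.
    """
    tokens = [t.strip("?.,").lower() for t in question.split()]
    relevant = []
    for header in headers:
        h = header.lower()
        # keep a header if any question token is sufficiently similar to it
        if any(SequenceMatcher(None, t, h).ratio() >= threshold for t in tokens):
            relevant.append(header)
    return relevant
```

The retained headers then define which columns of the table are carried forward into the sub-table of the next step.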
2. Hybrid Graph Construction:
- A sub-table is retrieved based on the relevant headers, and an Entity-Document Graph is constructed by linking entities from the text to the table cells.
- The two components (sub-table and Entity-Document Graph) are integrated into a single Hybrid Graph.
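The integration in this step can be illustrated with a small adjacency-list sketch. The input shapes (`sub_table` as a list of row dictionaries, `entity_docs` as a title-to-text mapping) are assumptions for the example, not the paper's API:

```python
def build_hybrid_graph(sub_table, entity_docs):
    """Build an adjacency-list Hybrid Graph linking table cells to documents.

    Simplified sketch: nodes are cell values and document titles; an edge
    joins two cells in the same row (table structure) or a cell to a
    document whose text mentions it (entity linking).
    """
    graph = {}

    def add_edge(u, v):
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)

    for row in sub_table:
        cells = list(row.values())
        # connect cells within the same row (table structure)
        for i in range(len(cells)):
            for j in range(i + 1, len(cells)):
                add_edge(cells[i], cells[j])
        # connect cells to documents that mention them (entity linking)
        for cell in cells:
            for title, text in entity_docs.items():
                if cell.lower() in text.lower():
                    add_edge(cell, title)
    return graph
```

Real entity linking is fuzzier than the substring check used here, but the resulting structure is the same: a single graph over both the sub-table and the linked documents.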
3. Hybrid Graph Traversal:
- The graph is pruned using Breadth-First Search (BFS) to retain only the most relevant paths for answering the question.
- The pruned graph is stored in a hop-wise dictionary, where each hop represents a level of traversal (1-hop, 2-hop, 3-hop).
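A minimal sketch of the hop-wise BFS pruning, assuming the graph is an adjacency dictionary and `seeds` are the nodes matched from the question (both names are illustrative):

```python
from collections import deque

def hopwise_prune(graph, seeds, max_hops=3):
    """BFS from question-linked seed nodes, grouping reachable nodes by hop.

    Returns a hop-wise dictionary {1: [...], 2: [...], 3: [...]} so the
    reader LLM can be fed context one traversal level at a time.
    """
    hops = {h: [] for h in range(1, max_hops + 1)}
    visited = set(seeds)
    frontier = deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # stop expanding beyond the deepest hop
        for neigh in graph.get(node, ()):
            if neigh not in visited:
                visited.add(neigh)
                hops[depth + 1].append(neigh)
                frontier.append((neigh, depth + 1))
    return hops
```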
4. Reader LLM:
- The pruned graph is passed to the LLM in a hop-wise manner. If the LLM cannot answer the question with the initial hop, additional hops are provided until the answer is found or the full context is used.
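The incremental reading strategy might look like the following sketch. `ask_llm` is a caller-supplied stand-in for the actual GPT/LLaMA call, and the "unknown" sentinel is an assumption for the example, not the paper's exact protocol:

```python
def hopwise_read(question, hops, ask_llm):
    """Query the reader LLM hop by hop, expanding context only on failure.

    `ask_llm(question, context) -> str` is a caller-supplied callable
    (e.g., wrapping a chat-completion API) that returns the model's
    answer, or the sentinel "unknown" when the context is insufficient.
    """
    context = []
    for hop in sorted(hops):
        context.extend(hops[hop])  # accumulate one more hop of context
        answer = ask_llm(question, context)
        if answer != "unknown":
            return answer, hop  # answered using `hop` levels of traversal
    return "unknown", max(hops) if hops else 0
```

Because most questions resolve within one or two hops (see the results below), this loop usually terminates early, which is where the token savings come from.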

Evaluation and Results
The system was evaluated on two challenging datasets:
1. Hybrid-QA: A multi-hop Table-Text QA dataset based on Wikipedia.
2. OTT-QA: An open-domain QA dataset requiring retrieval of both tables and text.
Key Findings:
- Performance: ODYSSEY outperformed all baselines, achieving 58.4% EM on Hybrid-QA and 62.02% EM on OTT-QA using GPT-4. It also showed strong performance with smaller models like LLaMA-3-8B.
- Token Efficiency: The approach reduced input token size by 45.5% on Hybrid-QA and 53% on OTT-QA, significantly lowering computational costs.
- Hop-wise Analysis: Nearly 90% of questions were answered using 1-hop or 2-hop connections, demonstrating the effectiveness of the Hybrid Graph in filtering irrelevant information.
Comparison with Baselines:
- ODYSSEY consistently outperformed baselines like Base w/ Table & Text and Base w/ Summarized Text across all metrics (EM, F1, Precision, Recall).
- It also outperformed fine-tuned models on OTT-QA and achieved comparable results on Hybrid-QA, despite operating in a zero-shot setting.
Ablation Studies
1. Hop-wise Retrieval: Passing all pruned information at once (instead of hop-wise) resulted in a slight performance drop and increased token usage.
2. Pruned Graph: Using the entire Hybrid Graph (without pruning) led to lower accuracy and higher token costs, highlighting the importance of pruning for efficiency.
Error Analysis
The system’s errors were categorized into:
1. Formatting Errors: Differences in answer formatting (e.g., "Regis Philbin" vs. "Regis Philbin,") affected EM scores.
2. Semantic Module Errors: Issues in entity matching, extraction, and header mapping.
3. LLM Errors: The LLM occasionally failed to provide correct answers despite having the necessary context.
4. Dataset Issues: Ambiguous questions or anomalies in the dataset.
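Formatting errors of the first kind are commonly reduced with SQuAD-style answer normalization before computing Exact Match; a sketch of such a normalizer (this is standard evaluation practice, not necessarily the paper's own scoring code):

```python
import re
import string

def normalize_answer(s):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    s = s.lower()
    s = "".join(ch for ch in s if ch not in string.punctuation)
    s = re.sub(r"\b(a|an|the)\b", " ", s)
    return " ".join(s.split())

def exact_match(pred, gold):
    # "Regis Philbin," and "Regis Philbin" normalize to the same string
    return normalize_answer(pred) == normalize_answer(gold)
```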
Conclusion
ODYSSEY introduces a zero-shot, fine-tuning-free approach for Table-Text QA, leveraging a Hybrid Graph to efficiently navigate multi-hop reasoning across structured and unstructured data. The system achieves state-of-the-art performance while significantly reducing token usage, making it a scalable solution for real-world applications. Future work could explore extending the approach to multi-modal datasets (e.g., images, videos) and improving entity-matching capabilities.
Limitations
1. Processing Time: The system incurs slightly more processing time than zero-shot baselines due to additional LLM calls and graph traversal.
2. Dependence on LLM Capabilities: Performance is tied to the capabilities of the underlying LLM, which may evolve over time.
3. Scope: The current implementation is limited to Table-Text data, but the approach could be extended to other multi-modal datasets.
Key Takeaways
- Efficiency: ODYSSEY reduces token usage by up to 53%, making it cost-effective for large-scale QA tasks.
- Performance: It achieves SoTA results on Hybrid-QA and OTT-QA, outperforming fine-tuned models in some cases.
- Scalability: The zero-shot, fine-tuning-free approach makes it adaptable to various domains without the need for expensive labeled data.