Agentic Reasoning: Reasoning LLMs with Tools for Deep Research


1. Introduction

Agentic Reasoning is a framework that enhances Large Language Model (LLM) reasoning by integrating external tool-using agents, including web search, code execution, and structured reasoning-context memory.

Unlike traditional LLM-based reasoning, which relies solely on internal inference, Agentic Reasoning dynamically engages external sources to improve logical deduction, fact retrieval, and problem-solving accuracy.

The framework introduces the Mind Map agent, which constructs a structured knowledge graph to track logical relationships, enhancing deductive reasoning.

It also integrates web search and coding agents to retrieve real-time information and perform computational analysis, significantly outperforming retrieval-augmented generation (RAG) systems and closed-source LLMs on complex research tasks.


2. Core Methodology

Agentic Reasoning follows a multi-agent architecture where LLMs interact with external tools. The reasoning process dynamically integrates four key components:

  • Task Instruction (o): Defines the reasoning objective.
  • Query (q): Represents the complex question requiring multi-step reasoning.
  • External Tool Outputs (e): Information retrieved from web search, coding execution, or memory graphs.
  • Reasoning Memory (k): Structured knowledge stored from previous reasoning steps.

The system uses a probability model:

P(r, a | o, q, e, k)

where r represents the chain of reasoning steps and a is the final answer, both conditioned on the task instruction, query, external tool outputs, and reasoning memory. The model optimizes both through structured retrieval and external agent interactions.
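To make the notation concrete, here is a minimal Python sketch (all names are my own, not from the paper) of how the four components could be bundled into a state that conditions each reasoning step:

```python
from dataclasses import dataclass, field

@dataclass
class ReasoningState:
    """Bundles the four inputs that condition each reasoning step."""
    task_instruction: str                                   # o: the reasoning objective
    query: str                                              # q: the multi-step question
    tool_outputs: list = field(default_factory=list)        # e: web/code/memory results
    reasoning_memory: dict = field(default_factory=dict)    # k: knowledge from prior steps

def next_step(llm, state: ReasoningState, partial_chain: list) -> str:
    """Sample the next reasoning step r_t conditioned on (o, q, e, k) and the chain so far."""
    prompt = (
        f"Task: {state.task_instruction}\n"
        f"Question: {state.query}\n"
        f"Tool outputs so far: {state.tool_outputs}\n"
        f"Memory: {state.reasoning_memory}\n"
        f"Reasoning so far: {' '.join(partial_chain)}\n"
        "Next step:"
    )
    return llm(prompt)  # `llm` is any callable mapping a prompt string to generated text
```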


3. The Agentic Reasoning Pipeline

The framework enables LLMs to autonomously determine when additional information is required, triggering specialized tokens that call external agents:

  • Web-search token: Retrieves real-time information.
  • Coding token: Executes calculations and simulations.
  • Mind Map token: Stores and organizes reasoning context.

This agent-based interaction ensures that the reasoning model retrieves, refines, and structures information dynamically, rather than relying solely on pre-trained knowledge.
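A hedged sketch of this token-triggered loop, assuming placeholder token strings and agent callables (the paper's actual tokens and interfaces may differ):

```python
# Minimal sketch of the agentic reasoning loop: the model emits special tokens,
# and the loop pauses generation, calls the matching agent, and feeds the result back.
WEB_SEARCH = "<search>"
CODE = "<code>"
MIND_MAP = "<mind_map>"

def agentic_reasoning(llm, web_agent, code_agent, mind_map_agent, prompt, max_steps=20):
    context = prompt
    for _ in range(max_steps):
        chunk = llm(context)          # generate until a tool token or a final answer appears
        context += chunk
        if WEB_SEARCH in chunk:
            query = chunk.split(WEB_SEARCH, 1)[1].strip()
            context += f"\n[search result] {web_agent(query)}\n"
        elif CODE in chunk:
            task = chunk.split(CODE, 1)[1].strip()
            context += f"\n[code output] {code_agent(task)}\n"
        elif MIND_MAP in chunk:
            question = chunk.split(MIND_MAP, 1)[1].strip()
            context += f"\n[memory] {mind_map_agent(question)}\n"
        else:
            return context            # no tool requested: treat the chunk as the final answer
    return context
```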


4. Key Components of Agentic Reasoning

  • Mind Map Agent

- Constructs a real-time knowledge graph by structuring logical relationships from reasoning chains.
- Uses community clustering to group reasoning contexts and generate summaries.
- Functions as an external memory tool, allowing LLMs to track arguments, clarify ambiguities, and retrieve past deductions.

  • Web-Search Agent

- Retrieves real-time and context-aware information from the web.
- Extracts concise summaries that match the reasoning task, such as:
  - Numerical values (e.g., “What is the population of the US in 2024?”).
  - Nuanced perspectives for open-ended topics.
  - Evidence validation for hypothesis-driven queries.

  • Coding Agent

- Offloads computation-heavy tasks to a specialized coding LLM.
- Executes quantitative analysis and returns structured outputs.
- Ensures separation of reasoning and execution, improving coherence (a combined sketch of all three agents follows below).
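Below is a minimal, illustrative sketch of how the three agents could plug into the reasoning loop above; the graph structure, naive clustering, and helper callables (`search_fn`, `summarize_fn`, `coder_llm`) are assumptions for illustration, not the paper's implementation:

```python
from collections import defaultdict

class MindMapAgent:
    """Toy external memory: stores (subject, relation, object) triples from reasoning steps
    and groups nodes into naive 'communities' by direct connectivity for summarization."""
    def __init__(self):
        self.edges = defaultdict(list)

    def add(self, subject, relation, obj):
        self.edges[subject].append((relation, obj))

    def query(self, node):
        # Return everything directly connected to a node (stand-in for graph retrieval).
        return self.edges.get(node, [])

    def communities(self):
        # Extremely naive clustering: each subject plus its direct neighbours forms one group.
        return [{s, *(o for _, o in rels)} for s, rels in self.edges.items()]

def web_search_agent(query, search_fn, summarize_fn):
    """Retrieve raw pages with a search backend, then return a concise task-relevant summary."""
    pages = search_fn(query)
    return summarize_fn(query, pages)

def coding_agent(task, coder_llm):
    """Delegate a computation to a dedicated coding model, execute it, and return `result`."""
    code = coder_llm(f"Write Python that computes: {task}\nAssign the answer to a variable named result.")
    scope = {}
    exec(code, scope)            # in practice this should run in a sandbox, not bare exec
    return scope["result"]
```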

5. Main Findings and Insights

  1. Minimal Tool Selection Improves Performance: a small, carefully chosen toolset (web search, code execution, and the Mind Map memory) works better than piling on many overlapping tools.
  2. Delegating Tasks to Specialized LLMs Enhances Efficiency: offloading computation to a dedicated coding agent keeps the main reasoning chain focused and coherent.
  3. Test-Time Scaling and Verifiability: allocating more tool calls and reasoning steps at inference time improves accuracy, and the externalized intermediate steps make the final answers easier to verify.


6. Experimental Results

Evaluation on GPQA (PhD-Level Scientific Reasoning Benchmark):

  • Agentic Reasoning significantly outperformed state-of-the-art LLMs in physics, chemistry, and biology on this benchmark.

Case Study: Medical Decision-Making

  • The model computed FiO2 (Fraction of Inspired Oxygen) via the coding agent.
  • It retrieved PEEP (Positive End-Expiratory Pressure) values via web search.
  • Combined insights for an optimal treatment plan, demonstrating real-world applicability.
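To illustrate the division of labour only (not clinical guidance, and not the paper's actual computation), here is a sketch where the coding agent evaluates a common rule-of-thumb approximation for FiO2 from nasal-cannula flow, while the PEEP value stands in for something the web-search agent retrieved:

```python
def estimate_fio2(flow_l_per_min: float) -> float:
    """Rule-of-thumb FiO2 for nasal cannula: ~21% room air plus roughly 4% per L/min of flow.
    This approximation is an assumption for illustration, not taken from the paper."""
    return min(0.21 + 0.04 * flow_l_per_min, 1.0)

# Value the web-search agent might have retrieved (illustrative only).
peep_cm_h2o = 5.0

# Coding-agent side: compute FiO2 for an assumed 4 L/min flow rate.
fio2 = estimate_fio2(4.0)
print(f"Estimated FiO2: {fio2:.2f}, retrieved PEEP: {peep_cm_h2o} cmH2O")
```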

Comparison with Human Experts:

  • Surpassed human experts in physics, chemistry, and biology in the GPQA Extended Set.
  • Higher pass rates in deep research tasks across finance, law, and medicine.


7. Future Implications

  1. Scaling to Multimodal Data: Future work will integrate images, charts, and tabular data for complex reasoning.
  2. Reinforcement Learning with Agentic Tools: Using tool usage as a reward signal could further optimize reasoning strategies.
  3. Enhanced Human-AI Collaboration: Agentic frameworks could power research assistants, scientific discovery, and automated expertise synthesis.


8. Conclusion

Agentic Reasoning redefines LLM reasoning by integrating external tools dynamically. It outperforms traditional models in expert-level knowledge tasks and research-driven problem-solving by leveraging structured memory, real-time search, and computational agents.

Future improvements in multimodal reasoning, reinforcement learning, and domain-specific tools will further enhance its ability to tackle complex real-world challenges.

#AI #DataScience #Data #GenerativeAI #ReinforcementLearningOptimization #ModelOptimizationTechniques #FineTuningLLMs

Follow me on LinkedIn: www.dhirubhai.net/comm/mynetwork/discovery-see-all?usecase=PEOPLE_FOLLOWS&followMember=florentliu
