Training Agentic Graph Systems for Orchestration: Beyond Hardcoded Workflows
Jon Brewton
Founder and CEO - USAF Vet; M.Sc. Eng; MBA; HBAP. Data Squared has created the only hallucination-resistant and fully explainable AI solution development platform in the world!
At Data2, we're confronting one of the most significant challenges in modern AI: the gap between true agency and the hardcoded workflows that merely mimic it. While much of the industry celebrates "AI agents," we've recognized that most of these systems are fundamentally constrained by predetermined orchestration patterns rather than exhibiting genuine decision-making autonomy.
The Illusion of Agency
Current AI systems labeled as "agents" typically operate within strict boundaries:
- They follow rigid, predetermined sequences of operations
- They lack the ability to dynamically determine when to use different capabilities
- They struggle with contradictory information from different sources
- They create the illusion of agency through complex but ultimately inflexible workflows
As we've learned through our work with knowledge graphs and AI integration, Large Language Models (LLMs) possess powerful capabilities like reasoning, search, memory, and planning, but they're not trained to orchestrate these abilities effectively. The result? Systems that appear intelligent but break down when facing novel scenarios requiring adaptive capability deployment.
From Prompt Engineering to Reward Engineering
The field of AI is witnessing a paradigm shift from traditional prompt engineering to a more powerful approach that could be called "reward engineering." This fundamental change in how AI systems learn to orchestrate their capabilities offers several significant advantages:
- Outcome-Focused Design: Rather than prescribing exact procedural steps, this approach defines what success looks like and allows the system to discover how to achieve it
- Experiential Learning: Models can discover optimal orchestration strategies through experience and iteration
- Adaptive Behavior Development: Reinforcement learning enables truly adaptive behaviors to emerge from simple reward signals
- Novel Situation Handling: The resulting systems can handle previously unseen scenarios without requiring explicit retraining
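The contrast above can be sketched in a few lines. This is a minimal, illustrative example, not Data2's implementation: a hardcoded workflow fixes the procedure up front, while reward engineering only defines what a good outcome looks like and leaves the procedure to the learner. All function names here are hypothetical.

```python
# Reward engineering: score only the final outcome, say nothing about
# the steps. A normalized exact-match check stands in for a real metric.
def outcome_reward(predicted: str, gold: str) -> float:
    normalize = lambda s: " ".join(s.lower().split())
    return 1.0 if normalize(predicted) == normalize(gold) else 0.0

# Prompt engineering, by contrast, prescribes the sequence itself:
def hardcoded_workflow(question: str) -> str:
    # step 1: always search; step 2: always summarize; step 3: answer.
    # The sequence never adapts, however simple the question.
    ...

print(outcome_reward("Paris ", "paris"))  # -> 1.0
```

Under reinforcement learning, the policy is free to discover its own sequence of search, reason, and answer actions, so long as the outcome reward stays high.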
The Search-R1 methodology exemplifies this approach, demonstrating how reinforcement learning can teach LLMs to autonomously decide when to search for information during reasoning. Instead of following explicit rules about when to search, models learn through experience to recognize situations where searching would provide value—a capability that proves difficult to encode through traditional prompting methods.
Link to video overview of the process: https://youtu.be/JIsgyk0Paic?si=ddPGN7LMWxudx_OY
Research Paper: https://arxiv.org/abs/2503.09516
Special thanks to Anthony Alcaraz for sharing this work.
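The interleaved reason-search-answer loop at the heart of this approach can be sketched as follows. This is a toy rollout under assumptions, not the actual Search-R1 code: `toy_model` and `toy_retriever` are stand-ins for a real LLM policy and retriever, and the `<search>`/`<information>`/`<answer>` tags mirror the paper's interaction format.

```python
# Sketch of a Search-R1-style rollout: the model decides when to emit a
# <search> call, the environment splices retrieved passages back into
# the context, and generation continues until an <answer> appears.

def toy_model(context: str) -> str:
    # Stand-in policy: search once, then answer from the retrieved text.
    if "<information>" not in context:
        return "<search>capital of France</search>"
    return "<answer>Paris</answer>"

def toy_retriever(query: str) -> str:
    corpus = {"capital of France": "Paris is the capital of France."}
    return corpus.get(query, "no results")

def rollout(question: str, max_turns: int = 4) -> str:
    context = question
    for _ in range(max_turns):
        step = toy_model(context)
        context += step
        if step.startswith("<answer>"):
            return step.removeprefix("<answer>").removesuffix("</answer>")
        if step.startswith("<search>"):
            query = step.removeprefix("<search>").removesuffix("</search>")
            context += f"<information>{toy_retriever(query)}</information>"
    return ""

print(rollout("What is the capital of France?"))  # -> Paris
```

In training, many such rollouts are scored by an outcome reward, and policy gradients teach the model which contexts make a search worthwhile.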
Why Knowledge Graphs Are Essential for True Agency
Our work with graph-based systems has revealed why they form the optimal foundation for agentic AI:
- Rich State Representation: Graphs provide clear state representations for RL algorithms, where nodes represent knowledge states and edges represent actions or transitions.
- Decision Pathway Modeling: Unlike linear sequences, graphs can represent branching decision paths, maintaining rich contextual relationships between options.
- Flexible Traversal: Graph structures support dynamic navigation through complex decision spaces, essential for adapting to novel scenarios.
- Context Preservation: Knowledge graphs maintain relationships between concepts, enabling more sophisticated reasoning about when to deploy specific capabilities.
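A small sketch makes the state/action framing concrete. The graph, node names, and Q-values below are invented for illustration and are not drawn from our systems: nodes stand for knowledge states, edges for capability invocations, and a learned policy would replace the fixed edge scores.

```python
# Knowledge graph as an RL state space: nodes are knowledge states,
# edges are actions (capability invocations) that move between them.
graph = {
    "question": {"search": "has_evidence", "answer": "answered_blind"},
    "has_evidence": {"reason": "has_conclusion", "search": "has_evidence"},
    "has_conclusion": {"answer": "answered_grounded"},
}

# A trained policy would score each (state, action) pair; fixed values
# stand in here so the traversal is deterministic.
q_values = {
    ("question", "search"): 0.9, ("question", "answer"): 0.1,
    ("has_evidence", "reason"): 0.8, ("has_evidence", "search"): 0.3,
    ("has_conclusion", "answer"): 1.0,
}

def greedy_traversal(state: str, max_steps: int = 5) -> list:
    """Follow the highest-value edge until reaching a terminal node."""
    path = [state]
    for _ in range(max_steps):
        actions = graph.get(state, {})
        if not actions:
            break
        best = max(actions, key=lambda a: q_values.get((state, a), 0.0))
        state = actions[best]
        path.append(state)
    return path

print(greedy_traversal("question"))
# -> ['question', 'has_evidence', 'has_conclusion', 'answered_grounded']
```

Because the graph represents branching paths rather than a fixed sequence, the same policy can route different questions through different capability chains.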
Our reView platform leverages these properties to create AI systems that go beyond mere tool use to achieve genuine orchestration intelligence.
Technical Innovations: Retrieved Token Masking
A critical innovation in our approach is "retrieved token masking," which prevents optimization of tokens from external sources while allowing the model to learn effective query generation and reasoning strategies. This technique solves a fundamental challenge in applying reinforcement learning to systems that incorporate external knowledge:
- It ensures policy gradient updates affect only the model's orchestration decisions
- It prevents the model from gaming the reward function by manipulating external content
- It maintains the integrity of retrieved information while optimizing how it's utilized
- It creates a clean separation between knowledge access and decision-making
This approach allows our systems to learn when and how to access external knowledge sources without compromising the quality or reliability of the information retrieved.
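The masking idea can be sketched in a few lines of plain Python. This is a simplified illustration under assumptions, not our production training code: the token log-probabilities, advantages, and mask are made up, and a real implementation would operate on tensors inside a policy-gradient framework.

```python
# Retrieved token masking: the policy-gradient loss is computed only
# over tokens the model generated, never over tokens spliced in from
# the retriever, so external content contributes no gradient.

def masked_pg_loss(log_probs, advantages, retrieved_mask):
    """Sum -logprob * advantage over model-generated positions only.

    retrieved_mask[i] == 1 marks a token copied from an external
    source; its contribution is zeroed so the model cannot be
    rewarded or punished for content it did not produce.
    """
    loss = 0.0
    for lp, adv, is_retrieved in zip(log_probs, advantages, retrieved_mask):
        if not is_retrieved:
            loss += -lp * adv
    return loss

log_probs = [-0.2, -0.5, -1.0, -0.1]  # per-token log-probabilities
advantages = [1.0, 1.0, 1.0, 1.0]     # outcome-based advantage signal
retrieved = [0, 0, 1, 0]              # third token came from retrieval

print(round(masked_pg_loss(log_probs, advantages, retrieved), 6))  # -> 0.8
```

Note that the retrieved token (index 2) carries the largest log-probability penalty, yet contributes nothing to the loss: optimization pressure falls entirely on the query-generation and reasoning tokens.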
Beyond Correctness: Outcome-Based Reward Functions
Our research has shown that simple outcome-based reward functions focused on final correctness can lead to the development of surprisingly sophisticated behaviors. Instead of trying to engineer every aspect of agent behavior, we define success metrics and allow the system to discover optimal pathways.
This approach produces AI systems that:
- Dynamically decide which capabilities to employ based on the specific situation
- Optimize resource utilization by only deploying expensive operations when necessary
- Adaptively respond to novel scenarios without explicit reprogramming
- Develop emergent strategies that human engineers might not have considered
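The resource-utilization point above can be captured with a cost-aware variant of the outcome reward. The weights below are illustrative assumptions, not tuned values from our systems: correctness dominates, but each expensive operation pays a small penalty, so the policy learns to deploy tools only when they actually help.

```python
# Cost-aware outcome reward: final correctness minus a per-call penalty
# for expensive operations such as searches or tool invocations.
def cost_aware_reward(correct: bool, num_tool_calls: int,
                      call_cost: float = 0.05) -> float:
    return (1.0 if correct else 0.0) - call_cost * num_tool_calls

# A correct answer with one search beats a correct answer with five:
print(cost_aware_reward(True, 1))   # -> 0.95
print(cost_aware_reward(True, 5))   # -> 0.75
# ...and a wrong answer is never rescued by skipping tool calls:
print(cost_aware_reward(False, 0))  # -> 0.0
```

Even a signal this simple gives rise to the behaviors listed above: the policy is never told which capability to use, only that unnecessary calls erode its reward.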
The Path Forward
As we continue to advance the state of the art in agentic AI, our focus remains on developing systems that exhibit true orchestration intelligence rather than merely following predefined workflows. By combining graph-based knowledge structures with reinforcement learning approaches to capability orchestration, we're creating AI agents that can:
- Autonomously decide when different capabilities would be valuable
- Effectively integrate information from multiple sources while handling contradictions
- Learn from experience to improve orchestration strategies over time
- Adapt to novel scenarios without requiring explicit reprogramming
Contact our team to learn how Data2 can help your organization move beyond hardcoded AI workflows to true agentic systems that deliver robust, adaptive intelligence for your most challenging use cases.
Jon Brewton: Automated workflows guided by graphs will be key to deploying "agents," or whatever we want to call them, into meaningful production.