A Decentralized AI/KG Web
Copyright 2025 Kurt Cagle / The Cagle Report
An Interesting Week
This has been an interesting week. On Sunday, a Chinese firm backed by the Chinese government released DeepSeek R1, an open-source, open-weight large language model (LLM) confabulator that cost only about $6 million to train on commodity GPU hardware. Since then, three other FOSS confabulators have debuted, ostensibly based on DeepSeek. On Monday, in response, the Nasdaq lost a lot of ground as SoftBank, MBC, nVidia and others shed roughly 10% of their market capitalization in the space of a single trading session, likely leaving Stargate, the massive AI infrastructure project with an ostensible valuation of $500 billion announced at Trump's inaugural, dead in the water before it even left drydock.
By Wednesday, OpenAI was accusing DeepSeek of intellectual property theft. On Friday, OpenAI announced that it would be managing nuclear weapons security for the US government - one of the most blatant cases of nepotism and corruption in the last couple of decades, met with collective gasps of WTF, and likely to result in significant Congressional investigations - even as Sam Altman called for still more investment in OpenAI (I suspect that his current set of investors are all getting very nervous).
What we're seeing now is a phenomenon we've seen many times before over the last few decades: a new technology or paradigm creates a favorable position for consolidation and centralization, with one or just a couple of companies dominating a specific space. In the 1990s and 2000s, that was Apple and then Microsoft in operating systems, Oracle in databases, Google in search, CompuServe and then AOL (which bought up CompuServe), then MySpace, then Facebook in social media, Apple in cellphones, IBM in desktops, Dell in laptops, Intel in microprocessors, and so forth and so on.
COTS vs FOSS
This is the challenge we face - the primacy of Commercial Off The Shelf (COTS) software vs. Free Open Source Software (FOSS). This tension has existed within IT since its inception, and it will play out here as well.
Yet if you look at the history of computing, what becomes most striking is that the trends actually point toward an increasingly decentralized future and protocol standardization. Linus Torvalds introduced Linux as a free and open-source operating system in 1991. Today, Linux underpins nearly every cell phone, almost every cloud computing system, and nearly all IoT devices, and it has shaped the rest of the operating system landscape (Windows now ships a Linux subsystem, and the Apple ecosystem is built on Unix). The World Wide Web was open-sourced early on, and nearly all contemporary browsers are freely distributed. Computer languages have been open source for a while now, and the overwhelming majority of relational and document databases in use today are open source. We created open standards for XML and JSON communication that have mostly replaced binary protocols.
We don't hear much about FOSS in the financial press, which has largely become a vehicle for selling big IPOs and lionizing investors and corporate executives, nor do we hear a lot about open standards unless they figure in a prominent company's product offerings. The reason, of course, has little to do with value and everything to do with money. That seemingly confusing statement deserves an explanation: the tech press, in general, is dominated by financial interests - namely investment banking and venture capital firms - and its emphasis is on selling products, services, and business consulting. They want to sell IPOs.
Generative AI emerged from the world of Natural Language Processing (and Understanding), which had its heyday in the early 2010s with Apple's rollout of Siri in 2011. At the time, the technology was too immature, and Siri was eventually, quietly, deprecated. In 2017 (after the financial industry was unable to get the Metaverse off the ground and self-driving vehicles proved too immature), the focus shifted to new processing models that took search and, rather than simply indexing properties, encoded content into a vector map that could then use clustering to generate meaningful sentences. OpenAI appeared on the radar in the early 2020s, but then the pandemic hit, and things cooled until late 2022, at which point the hype mill went into full gear.
The goal eventually was to create a generative artificial intelligence system (most of the rest of the field was concentrating on diffusion imaging technologies), and ChatGPT 3, when it was released, looked like it was close if not there. ChatGPT 4 followed about a year later, and OpenAI became an industry darling, attracting an insane amount of investment. All of a sudden, the talking heads were saying that programmers and other employees were obsolete, that the age of programming was dead, and that a connection to OpenAI would be all you needed.
Perhaps because the hype was so extreme, perhaps because the Metaverse was still relatively fresh in people's minds, perhaps because the social implications of ChatGPT were so negative, a quiet backlash turned into a tsunami this week, and as the waters finally receded, the future of centralized intelligence lay among the detritus.
It is time for another approach.
Why AGI Doesn't Work
There are many problems with the current approach to AI. Some of these have to do with the fact that hallucinations seem increasingly to be not just a bug but a fundamental flaw of the system. There are a number of reasons for this, mostly dealing with the nature of latent spaces, but the upshot is that an LLM, a confabulator, is better at making stuff up than it is at being a database by itself. If you're generating imagery or video, making things up can be quite useful. If you're trying to build a superintelligence, making things up can be ... problematic.
I have, for more than two years, been making the argument that you need several things to help stabilize confabulators. One of them is a knowledge graph (KG), preferably one with a working schema and taxonomy. This doesn't completely eliminate hallucinations, but it does let you pass the graph's content and schema to the model, either via RAG or via ingestion, in a way that maps narrative structures onto discrete, identified concepts. This approach would eventually become known as GraphRAG (Graph Retrieval-Augmented Generation), and it seems to be one the industry is gravitating toward.
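To make the pattern concrete, here is a minimal sketch of GraphRAG, assuming a SPARQL-accessible knowledge graph. The endpoint, the entity URI, and the `complete()` helper are placeholders for whatever triple store and LLM client you actually use; this is an illustration, not a reference implementation.

```python
# Minimal GraphRAG sketch: retrieve facts about a concept from a knowledge
# graph, then hand them to the LLM as grounding context. The endpoint URL,
# entity URI, and complete() helper are hypothetical placeholders.
from SPARQLWrapper import SPARQLWrapper, JSON

def retrieve_facts(entity_uri: str, endpoint: str) -> list[str]:
    """Pull labelled statements about one concept from the graph."""
    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(f"""
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT ?p ?oLabel WHERE {{
            <{entity_uri}> ?p ?o .
            ?o rdfs:label ?oLabel .
        }} LIMIT 25
    """)
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    return [f'{r["p"]["value"]} -> {r["oLabel"]["value"]}' for r in rows]

def graph_rag_answer(question: str, entity_uri: str, endpoint: str, complete) -> str:
    """Ground the prompt in graph facts before generation."""
    facts = "\n".join(retrieve_facts(entity_uri, endpoint))
    prompt = (
        "Answer using ONLY the facts below; say 'unknown' otherwise.\n"
        f"Facts:\n{facts}\n\nQuestion: {question}"
    )
    return complete(prompt)  # complete() is whatever chat-completion client you use
```

The point of the pattern is not the plumbing but the contract: the graph supplies identified concepts, and the confabulator is told to stay inside them.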
The other is to make large language models ... smaller. This may seem counterintuitive, but I believe there is a solid rationale for it. Up to a certain size, as you add content to an LLM, what you are doing is creating a latent space - narrative structures that are mapped into chaotic vectors, then reduced in dimensionality to a certain extent in order to make the mathematics feasible. Imagine informational cobwebs with embedded dust bunnies, and you get a pretty good idea of what latent spaces look like. The input order of the prompt then starts you down certain narrative conversations - paths within these cobwebs - generating the output seemingly magically.
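As a purely illustrative sketch of how narrative content becomes points in such a space: sentences are encoded into high-dimensional vectors and then reduced in dimensionality. The embedding model named below is an assumption; any sentence encoder would do.

```python
# Illustrative only: map narrative text into vectors, then reduce the
# dimensionality so the "cobweb" can be inspected. The model name is an
# assumption, not a recommendation.
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA

sentences = [
    "Knowledge graphs give concepts stable identifiers.",
    "Large language models generate fluent narrative text.",
    "Vector search clusters similar ideas together.",
]

encoder = SentenceTransformer("all-MiniLM-L6-v2")     # assumed encoder
vectors = encoder.encode(sentences)                    # high-dimensional embeddings

reduced = PCA(n_components=2).fit_transform(vectors)  # squash to 2-D for inspection
for text, point in zip(sentences, reduced):
    print(point, text)
```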
However, the broader the LLM training set, the more the path being returned is going to "get lost", tracking a sequence that becomes nonsense. This is especially true with zero-shot (small prompt) queries, where the response simply does not have enough information to provide meaningful output. Statistically, this is akin to overfitting, and while LLMs do not completely follow normal statistical models (they are chaotic, not random), the pattern holds: the more specialized the language model, the more likely the information being retrieved remains relevant, and the less likely the model hallucinates, though at the cost of being poorer at roleplay (in which the LLM "thinks" it's something else, such as a database or a customer).
This is one of the reasons that I tend to harp a lot about extroversion and introversion. An extrovert, by and large, is a shallow thinker, but one who likes to believe that they are good at everything. Shallow thinking is by itself a good thing - it means that you are quick and adept at handling conventional problems, and you tend to fake it 'til you make it. Introverts (for a number of reasons) are usually deeper thinkers - they have in-depth knowledge gained from extensive experience, but this is knowledge that tends to surface by taking more time to build multiple layers of abstractions (deep graphs) in their heads. They take longer to think things through, but when they do, they are usually right. They are often thought of as insightful.
I think language models tend to reflect this duality. Facile data models have a lot of general knowledge, and are very good at comparatively simple tasks, but tend to be poor reasoners. Their graph is broad but shallow. Deep language models have to go farther in depth, and tasks take longer, but the results that come back usually reflect that expertise. In effect, they use a pruning strategy to restrict what paths they take through the latent space (which is something that DeepSeek does, by the way).
So why not do both in the same system? Because the longer the path taken, the more likely it is to veer off into hallucination, and because graph traversal is an extraordinarily expensive operation that dramatically adds to the size of the token context. Again, an analogy may make this more obvious. When playing chess, the brute-force method is to look at the current state of the game, then create a graph that shows the board after one, two, three, etc. steps. The number of possible configurations increases combinatorially with each step, which is why most chess algorithms have both an upper limit to the number of steps they look ahead and very sophisticated pruning algorithms. Speech is (much) more complex than chess here. Shallow patterns have fast traversal but limited depth; deep patterns have extensive traversal but are tightly focused. Because of memory restrictions, you can't have both.
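For the chess point, a generic depth-limited search with alpha-beta pruning shows both ideas - the lookahead cap and the pruning - in a few lines. The `moves()` and `evaluate()` functions are hypothetical hooks that a real game engine would supply; this is a sketch of the technique, not an engine.

```python
# Generic depth-limited search with alpha-beta pruning, illustrating why
# lookahead must be capped and branches pruned. moves() and evaluate() are
# hypothetical hooks a real chess engine would provide.
def search(state, depth, alpha, beta, maximizing, moves, evaluate):
    if depth == 0 or not moves(state):
        return evaluate(state)                # stop at the lookahead limit
    if maximizing:
        best = float("-inf")
        for nxt in moves(state):
            best = max(best, search(nxt, depth - 1, alpha, beta, False, moves, evaluate))
            alpha = max(alpha, best)
            if beta <= alpha:                 # prune: this branch cannot change the outcome
                break
        return best
    best = float("inf")
    for nxt in moves(state):
        best = min(best, search(nxt, depth - 1, alpha, beta, True, moves, evaluate))
        beta = min(beta, best)
        if beta <= alpha:
            break
    return best
```

Without the depth limit and the pruning, the cost grows combinatorially with every step ahead, which is exactly the trade-off a latent-space traversal faces.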
There is one other problem with broad, public-facing LLMs: guardrails. The larger your audience, the less predictable it is, the more likely that someone will feel slighted by something the model says, and consequently the more you have to examine and suppress errant output. That consumes huge amounts of context that would otherwise be used for computation. This isn't just token production; it is also algorithmic, as there are usually additional algorithms invoked for post-processing.
AGIs would be possible (barely - there are other constraints) if you had infinitely long, infinitely fast contexts and if traversals could be done to arbitrary depths at zero cost. None of these holds true in the real world. Information has latency and is subject (somewhat) to thermodynamic principles.
The centralized LLM model can then be seen primarily as a huge system in which the LLM becomes a constriction or control point, with a limited amount of visibility from any client about what lies on the other side.
Decentralized LlamaGraphs and the AI Web
I'm going to propose a new type of data object called a LlamaGraph (LMG). A LlamaGraph marries a language model with a knowledge graph as an integrated object. There are already prototype LMGs out there - Microsoft's GraphRAG system, for instance, would probably qualify, with a Neo4j graph back end - but the idea is that you rely upon the knowledge graph for critical reasoning with known objects and use the LLM primarily as a mechanism for supplementing that core with narrative content.
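A rough sketch of what an LMG object might look like follows. Every class and method name here is invented for illustration; the design assumption is that the knowledge graph is authoritative and the LLM only narrates around verified facts.

```python
# Hypothetical LlamaGraph (LMG): a knowledge graph and a language model
# bundled as one object. The graph answers first; the LLM only phrases.
from dataclasses import dataclass
from typing import Callable

@dataclass
class LlamaGraph:
    query_graph: Callable[[str], list[str]]  # e.g. a function wrapping a SPARQL query
    complete: Callable[[str], str]           # any chat-completion client

    def ask(self, question: str) -> str:
        facts = self.query_graph(question)
        if not facts:
            return "No grounded answer available."  # refuse rather than confabulate
        prompt = (
            "Write a short answer strictly from these facts:\n"
            + "\n".join(facts)
            + f"\n\nQuestion: {question}"
        )
        return self.complete(prompt)
```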
I see an LMG as a node connected in a network to other LMGs - an AI mesh. In current LLM practice, you have a concept usually called a Mixture of Experts (MoE). An application such as ChatGPT is not a single server. Instead, when you make a call, most ChatGPT models pass that prompt off to one or more experts, each of which is more specialized, and then return the result. This then gets presented as though it came from a single model.
DeepSeek takes this one step further by applying a filtering mechanism that suppresses those nodes that are not relevant, even within the model itself. This means that DeepSeek can seem a little flat - it has less "imagination" from which to work - but it is also more likely to be correct.
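A toy illustration of that routing idea - score every expert, keep only the top-k - is shown below. The numbers and the "experts" are made up, and this is not DeepSeek's actual implementation; it only shows the gating mechanic.

```python
# Toy mixture-of-experts routing: score all experts, run only the top-k.
# This sketches the "suppress irrelevant nodes" idea; it is not DeepSeek code.
import numpy as np

def route(prompt_vector: np.ndarray, gate: np.ndarray, experts: list, k: int = 2):
    scores = gate @ prompt_vector                  # one relevance score per expert
    top = np.argsort(scores)[-k:]                  # keep only the k best experts
    weights = np.exp(scores[top]) / np.exp(scores[top]).sum()
    outputs = [experts[i](prompt_vector) for i in top]
    return sum(w * o for w, o in zip(weights, outputs))

# Example with three dummy "experts" (each is just a random linear map here).
rng = np.random.default_rng(0)
experts = [lambda x, W=rng.standard_normal((4, 4)): W @ x for _ in range(3)]
gate = rng.standard_normal((3, 4))
print(route(rng.standard_normal(4), gate, experts, k=2))
```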
Couple this with a knowledge graph that includes a suite of schemas identifying data patterns, and potentially pre-seed the LLM with URIs that act as unique keys for retrieving conceptual clusters, and you have a surprisingly robust data delivery mechanism - one that handles both the transformation of output and the integration of heterogeneous data sources that are not LLM based.
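One way that URI pre-seeding might look in practice is simply to hand the model stable identifiers up front so its answers can be keyed back to graph concepts. The URIs and labels below are invented for illustration.

```python
# Sketch of URI pre-seeding: give the model stable identifiers so concepts in
# its answer can be keyed back to graph nodes. All URIs here are invented.
concepts = {
    "https://example.org/id/Product/battery-x1": "Battery X1, 48V lithium pack",
    "https://example.org/id/Supplier/acme":      "Acme Components, primary supplier",
}

seed = "\n".join(f"{uri} :: {label}" for uri, label in concepts.items())
prompt = (
    "Known concepts (refer to them by URI, never invent new ones):\n"
    f"{seed}\n\n"
    "Question: Who supplies the Battery X1, and what is its voltage?"
)
print(prompt)  # hand this to whatever completion client you use
```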
What happens if you create such an LMG for an organization, perhaps with other LMGs handling department- or topic-specific content? In effect, this becomes a specific organizational intelligence. You can talk to it, get specific information about it, have it perform specialized agentic actions, and so forth. An action is simply a fancy word for a web service or endpoint, and you don't need a multi-hundred-billion-dollar company to build these. Specific LMGs can even become dedicated hubs, experts at orchestrating actions within your organization.
If my LMG client can reach out to your LMG server, I can query the state of your organization, assuming I have the appropriate identity keys and access permissions. I can even set up a conversation between my LMG and your LMG, so long as I have the parameters that indicate completion of that process. My LMG and your LMG can, in turn, be involved in another conversation with a third party's LMG. An example of such an exchange might look as follows:
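Here is a hedged sketch of such an exchange over plain HTTP; the endpoint path, bearer token, and payload fields are all invented for illustration and do not describe an existing protocol.

```python
# Hypothetical LMG-to-LMG exchange over plain HTTP. The endpoint path, the
# bearer token, and the payload fields are invented for illustration.
import requests

MY_TOKEN = "example-access-token"          # identity key granted by the other organization
THEIR_LMG = "https://lmg.example.org/ask"  # hypothetical remote LMG endpoint

def ask_remote_lmg(question: str) -> dict:
    resp = requests.post(
        THEIR_LMG,
        headers={"Authorization": f"Bearer {MY_TOKEN}"},
        json={
            "question": question,
            "respond_with": ["answer", "source_uris"],  # ask for graph provenance
            "done_when": "answer_or_refusal",            # completion criterion for the conversation
        },
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

# e.g. ask_remote_lmg("What is the current lead time on part 48V-X1?")
```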
Does this sound familiar? It should. This is the way the web first emerged about 35 years ago. Rather than a single "generalized" LLM, you have specialized LMGs that do everything from acting as service domain registries to orchestrating actions, retrieving data content of various sorts, performing analytics, and handling presentation generation. From an intelligence standpoint, this is actually much closer to the way our brains operate, where different actions are handled by separate "specialized" regions that all work in concert.
Moreover, these can be combined with non-LMG components as need be - if you don't need a confabulator, don't use it. It's that simple.
To make this possible, we do need coordination and standards - something analogous to HTTP and HTML, albeit at a somewhat higher level of abstraction. It could also piggyback on top of the existing web infrastructure, with extensions added as we build more complex interconnected systems.
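As a sketch of what the server side of such a convention might look like, here is a minimal FastAPI endpoint. The message shape (question, answer, source_uris) is an invented convention mirroring the client sketch above, not an existing standard.

```python
# Server side of the hypothetical LMG convention, using FastAPI. The message
# shape is an invented convention, not a published standard.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class AskRequest(BaseModel):
    question: str
    respond_with: list[str] = ["answer"]

class AskResponse(BaseModel):
    answer: str
    source_uris: list[str] = []

@app.post("/ask", response_model=AskResponse)
def ask(req: AskRequest) -> AskResponse:
    # A real LMG would query its knowledge graph here, then let the language
    # model phrase the result. This stub just echoes the question.
    return AskResponse(answer=f"Received: {req.question}", source_uris=[])
```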
Implications of a Decentralized Data Web
Architecture drives evolution. We have settled on a decentralized architecture for the web because it works. There is an extended history of attempts to build walled gardens that restrict access to and from the broader web - Facebook, Instagram, TikTok, AOL, MySpace, etc. Most of these gardens don't last: they get caught up in politics, they are bought and sold, and they become points of power concentration, and with it oppression. They also become more restrictive over time, as they do everything they can to keep people from moving from one garden to another. They restrict innovation and experimentation, and they often become mechanisms for illicitly extracting value from the platform's members.
The idea of a decentralized data web is far from new. Indeed, it can be argued that Linked Data, introduced by Tim Berners-Lee and others in the mid-2000s, represented a first step toward a data web. More recently, the Solid initiative emerged as another approach (pre-GenAI) for both creating and managing data nodes in a broad-scale mesh network.
Solid, by itself, didn't quite succeed - the interoperability aspect proved problematic, as did the fact that most data is about something rather than produced by something - but there are many problems that can be solved by moving to a decentralized data web. For instance, verifiable credentials and decentralized identifiers could sit at the foundation of a permissions supernetwork, ensuring that sensitive information is manageable by the people it most directly impacts. True agentic systems, in which autonomous agents interact dynamically with the data context around them, also become feasible, and it means you can even create LMGs that reside in a car, a phone, or a camera. They may not necessarily be very smart, but they don't have to be - they just need to be smart enough.
My anticipation is that centralized LLMs are not going away; they will instead simply become large-scale cloud services that people use because they provide ease of use or specialized data access. This is fine: it mirrors the way the web has evolved today. However, I think that LMGs (or whatever they end up being called) will fill in the ecosystem, such that organizations and even individuals have intelligences they can call upon.
And of course, there's always the possibility of having intelligent flying toasters, as IoT meets AI (okay, not very intelligent, admittedly - cue the Portal turret references).
In media res,
Editor, The Cagle Report
If you want to shoot the breeze or have a cup of virtual coffee, I have a Calendly account at https://calendly.com/theCagleReport. I am available for consulting and full-time work as an ontologist, AI/Knowledge Graph guru, and coffee maker.
I've created a Ko-fi account for voluntary contributions, either one-time or ongoing. If you find value in my articles, technical pieces, or general thoughts about work in the 21st century, please contribute something to keep me afloat so I can continue writing.