From gen AI to next-gen AI
For all its successes, both hallucinated and real, gen AI is still architected on foundations with unresolved gaps, design flaws, and inherent weaknesses. A deeper understanding of these flaws will help us design and build next-gen AI that's more reliable, more accessible, more manageable – and a better investment for clients.
From one perspective, there are four main families of issues with gen AI: some receive lots of attention and are well-known, while others are buried in the rubble of the foundations themselves. A more detailed list of issues with LLMs is covered here.
Engineering Problems
In spite of the growing availability of tools for leveraging existing LLMs, building broad-coverage LLMs themselves continues to be a huge challenge, accessible only to the largest organizations. Pre-training time, cost, and infrastructure requirements are shockingly high. Large-scale batch pre-training leads to issues of data freshness. Fine-tuning existing models and hosting them for low-latency deployments also require lots of development time and costly infrastructure. And because the technology is so new, trial-and-error approaches to development lead to even longer deployment timelines, along with much higher costs.
One key underlying issue is that the usual gen AI approach depends crucially on models from a very high-dimensional design space.
Models in gen AI are conceived of in token space: each input token or subtoken, along with its n-way interactions with other tokens, is tracked and modeled.
Even if we reduce the token space to the 50,000 most frequent tokens (and ignore all the others – a common practice), naïve tracking of only the pairwise co-occurrences would require 2.5 billion parameters. Three-way, four-way, and higher co-occurrences quickly multiply that to astronomical values.
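To make the arithmetic concrete, here is a back-of-the-envelope calculation in Python. It only restates the 50,000-token assumption above and counts ordered k-tuples of tokens.

```python
# Back-of-the-envelope: how fast naive co-occurrence tracking grows.
# Assumes the 50,000-token vocabulary mentioned above; counts ordered k-tuples.
VOCAB = 50_000

for k in range(2, 5):
    print(f"{k}-way co-occurrences: {VOCAB ** k:.2e} parameters")

# 2-way: 2.50e+09  (the 2.5 billion figure above)
# 3-way: 1.25e+14
# 4-way: 6.25e+18
```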
For these reasons, among others, a significant change is in the works.
Next-gen AI is likely to shift to a smaller, more streamlined concept space for designing and training models, instead of a token space.
Think of it as the creation of a layer of AI "above" LLMs – a cognitive or conceptual layer. One crucial difference is that in a concept space, systems will model abstract concepts – not the hundreds or thousands of nearly synonymous observable string tokens that we might use to communicate each concept. Rather than ingesting random raw text on a mind-boggling scale, we will ingest pre-structured, information-dense concepts defined in knowledge graphs, ontologies, taxonomies, dictionaries, encyclopedias, and glossaries on a much more manageable scale – most likely as a separate "modality" from text – as we do already with images, audio, and video for multimodal LLMs. In this approach, a language model will continue to be an essential and effective add-on that powers the API, rather than trying to execute core reasoning processes.
This shift will yield something like a 1000x reduction in the design space, since we can easily find 1000 synonyms or translations (often more!) for the same concept in a large multilingual corpus of text tokens. Using the same machinery that is available now, we will be able to predict the next concept instead of the next token. At this more compact scale, interactive (even real-time) fine-tuning becomes much more feasible, as does detailed domain adaptation – which helps drive down development time and infrastructure costs. This shift, in turn, will also have major consequences for the other families of issues that plague gen AI, as sketched below.
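As a toy illustration of what "predicting the next concept" could look like, here is a minimal Python sketch. The lexicon, concept IDs, and corpus below are invented for this example; a real system would draw them from knowledge graphs or ontologies, as described above.

```python
# Minimal sketch (not a real implementation): predicting the next *concept*
# instead of the next token. The lexicon and concept IDs are hypothetical.
from collections import defaultdict

# Hypothetical lexicon: many surface tokens collapse onto one concept ID.
LEXICON = {
    "car": "C:automobile", "automobile": "C:automobile", "voiture": "C:automobile",
    "buy": "C:purchase", "purchase": "C:purchase", "acheter": "C:purchase",
    "cheap": "C:inexpensive", "inexpensive": "C:inexpensive",
}

def to_concepts(tokens):
    """Map observable tokens onto abstract concept IDs, dropping unknowns."""
    return [LEXICON[t] for t in tokens if t in LEXICON]

# Train a toy bigram model in concept space instead of token space.
bigrams = defaultdict(lambda: defaultdict(int))
corpus = [["buy", "cheap", "car"],
          ["acheter", "voiture"],
          ["purchase", "inexpensive", "automobile"]]
for sentence in corpus:
    concepts = to_concepts(sentence)
    for prev, nxt in zip(concepts, concepts[1:]):
        bigrams[prev][nxt] += 1

def predict_next_concept(prev):
    """Return the most frequent concept following `prev` in the toy corpus."""
    followers = bigrams.get(prev)
    return max(followers, key=followers.get) if followers else None

print(predict_next_concept("C:purchase"))  # -> C:inexpensive
```

The point of the sketch is only the change of scale: the model's vocabulary is a handful of concept IDs rather than every synonym and translation that could express them.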
Quality Management Problems
Current LLMs – and the systems based on them – face nearly insurmountable challenges of quality management.
We have few, if any, reliable tools and processes for tracking and troubleshooting issues in training data, in output quality, in compliance, in rights management, in version control, and in modeling dependencies. Relying on human ground truth, i.e., checking only tiny samples by hand, is clearly not enough. We've barely begun to deploy gen AI and the news is already full of stories of LLMs generating responses that create unwanted financial obligations for their creators, of hallucinations that damage their brands, of litigation over rights to training data – and more legislation on the horizon will create a broad range of additional challenges for quality and provenance management.
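As one small illustration of what more systematic provenance tracking could look like, here is a hypothetical record structure in Python. The fields and values are my own assumptions, not an existing tool or standard.

```python
# Hypothetical provenance record for a training document or generated output.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ProvenanceRecord:
    item_id: str                 # ID of the training document or generated output
    source: str                  # where the text came from
    license: str                 # rights / licensing status
    version: str                 # dataset or model version it belongs to
    checks_passed: List[str] = field(default_factory=list)  # quality checks applied

record = ProvenanceRecord(
    item_id="doc-00042",
    source="internal product manual",
    license="company-owned",
    version="dataset-v3.1",
    checks_passed=["deduplicated", "PII-scrubbed"],
)
print(record)
```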
On top of that, published training runs and experimental results on gen AI systems are most often not replicable – neither within an organization nor across companies. No one seems to know how reliable gen AIs are or why they work when they do. This ill-defined and poorly documented trial-and-error development method leads directly to skyrocketing development costs and unpredictable deployment timelines. These are key reasons – mostly tied to unclear ROI – why many companies hesitate to embrace gen AI.
Market pressures will relentlessly force next-gen AI to shift to more transparent and structured quality management – away from current black-box models – with explicit, automatable evaluation criteria and significantly more detailed documentation.
This change will enable more systematic, predictable development processes and much more reliable "ready for production" decisions. A renewed emphasis on data quality and the shift to concept space models will both enable and accelerate this change by improving transparency, reducing model size, and creating a well-documented conceptual vocabulary for evaluation – and perhaps by making the legal link to copyrighted text less contentious.
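To make "explicit, automatable evaluation criteria" more concrete, here is a minimal Python sketch. The specific criteria and thresholds are hypothetical; the point is that each check is named, automatable, and auditable.

```python
# Illustrative only: explicit, automatable evaluation criteria for model output.
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Criterion:
    name: str
    check: Callable[[str], bool]   # returns True if the output passes

def evaluate(output: str, criteria: List[Criterion]) -> dict:
    """Run every criterion and return an auditable per-criterion report."""
    return {c.name: c.check(output) for c in criteria}

criteria = [
    Criterion("mentions_required_concept", lambda o: "refund policy" in o.lower()),
    Criterion("no_unsupported_commitment", lambda o: "we guarantee" not in o.lower()),
    Criterion("within_length_budget",      lambda o: len(o.split()) <= 150),
]

model_output = "Our refund policy allows returns within 30 days."
report = evaluate(model_output, criteria)
print(report)                 # {'mentions_required_concept': True, ...}
print(all(report.values()))   # a documented, repeatable "ready for production" signal
```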
User-Control Problems
LLMs are the ultimate API for humans to "drive" AIs – they already do a great job of enabling interaction through messy, variable human language.
But the prompts that elicit LLM output are very brittle – relying on token models means that LLMs will often produce wildly different responses for queries that are synonymous or functionally equivalent for humans.
And relying on frequency of token mentions for training means that LLM performance is degraded – by design – for "smaller" languages and less-frequently-discussed topics. The pervasive need for extensive prompt engineering is evidence of these weaknesses; it increases development costs, delays the rollout of reliable systems – and, I suspect, is a throwaway effort that will see very little later reuse. News of the death of prompt engineering may be exaggerated, but it is already clear to many that it is by no means a sustainable approach.
Next-gen AI will likely be coerced into leveraging concept-space models like those in knowledge graphs and ontologies to "translate" prompts into a more abstract, more reliable form – rather than into a region of similar-token space as in vector databases.
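Here is a deliberately simplified Python sketch of that kind of "translation": mapping prompt mentions onto ontology concept IDs instead of embedding the prompt into a similar-token vector space. The mini-ontology and concept IDs are invented for illustration.

```python
# Sketch: normalize a prompt into concept space via a (tiny, hypothetical) ontology.
ONTOLOGY = {
    "laptop": "Q:portable_computer", "notebook": "Q:portable_computer",
    "battery life": "Q:battery_endurance", "runtime": "Q:battery_endurance",
    "improve": "Q:increase", "extend": "Q:increase",
}

def to_concept_query(prompt: str):
    """Return the set of concept IDs mentioned in the prompt (longest match first)."""
    text, found = prompt.lower(), set()
    for mention in sorted(ONTOLOGY, key=len, reverse=True):
        if mention in text:
            found.add(ONTOLOGY[mention])
            text = text.replace(mention, " ")
    return found

# Synonymous prompts map to the same abstract query, so downstream behavior is stable.
print(to_concept_query("How can I improve my laptop's battery life?"))
print(to_concept_query("How do I extend the runtime of my notebook?"))
# Both print: {'Q:portable_computer', 'Q:battery_endurance', 'Q:increase'}
```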
There is already considerable hard evidence that Knowledge Graphs and ontologies improve every step of the LLM development and deployment process.
Moving to concept spaces – semantic or cognitive layers on top of LLMs – will make LLMs less dependent on frequency of token mentions and on web-scale corpora during training, will enable more transparent quality management, and will ensure less ambiguous, more streamlined user-system communication.
Conceptual Problems
Today's LLMs are very new additions to our cognitive repertoire, so we haven't yet had many opportunities to think about them carefully and systematically. As we do consider their foundational assumptions in more detail, we inevitably find mismatches between the marketing hype, the engineers' naïve assumptions, and the carefully curated concepts from other domains. These mismatches inflate expectations, confuse consumers, disappoint investors, and block the interdisciplinary collaboration that would otherwise accelerate progress.
One of these mismatches is the use of terms like understanding and reasoning to describe the internal operations of LLMs. LLMs were designed and developed to generate natural-sounding sentences and paragraphs – NOT to understand sentences or to reason about the world. And in fact they do a spectacular job of generating sentences. But naturally, they do not perform well on a wide range of reasoning tasks – in fact, it's surprising that they produce correct responses for many of them at all, accidentally rather than by design – which is why some people talk about emergent (in this case, unexpected) properties of LLMs and call that "reasoning".
Among researchers who focus on processes like understanding and reasoning in humans, i.e. cognitive psychologists or cognitive scientists like me, these processes don't happen or even exist without significant involvement of not-directly-observable concepts that are separate from (and have very different characteristics from) the directly observable tokens or words that we use to communicate them. Understanding, on this view, is the process of creating a mapping from visible tokens (or gestures, or other things) to unobservable concepts – without concepts, there is no understanding. And reasoning, on this view, is the process of manipulating concepts without consideration of the tokens that we might use to communicate them. Reasoning abstracts away from words to relate concepts directly, compare them, add details or relations, evaluate their coherence or evidential base, etc. Because LLMs manipulate only tokens and have no identifiable representations of concepts or types of concepts – that is, no ostensible semantics – claims about how well LLMs "understand" or "reason" create conceptual mismatches that confuse developers, investors, and clients alike. This is not to say that LLMs aren't useful; they clearly are – but most of them don't understand or reason in any technical sense – just as clocks don't "know" the time and calculators don't "know" math.
Next-gen AI is likely to capitalize on the shift to concept space models to alleviate these conceptual mismatches and make the capabilities and value-add of AI systems more transparent for both developers and other stakeholders. Once AI systems map reliably between token spaces and concept spaces (a long-standing focus of many natural language understanding researchers), then the parallels with human understanding and reasoning become much clearer. Current efforts to build multimodal LLMs are making good progress in this direction: image generators like Stable Diffusion and Midjourney model language tokens separately (a token space), pixels separately (a concept space, with concepts represented as patterns of pixels), and the mappings between them, as well.
The already significant impact of explicit concept stores like knowledge graphs and ontologies at every step in the development and deployment of LLMs is an important indicator of systems to come.