A quantum leap from LLMs to LWMs, or how GenAI could navigate the ‘trough of disillusionment’
The new RAND report, based on interviews with technical experts, data scientists and engineers, finds that more than 80% of AI/ML projects fail. It identifies five root causes of AI project failure, ranging from misunderstanding the problem to the complexity of the technology solution.
Besides, in its latest Hype Cycle for Emerging Technologies chart (2024), organized around four key areas: autonomous AI, developer productivity, total experience, and human-centric security and privacy programs, Gartner placed gen AI in the "Trough of Disillusionment".
Given this downward trend, we are developing world-modeling recommendations to help AI/ML projects succeed in personal, government, academic and industry environments.
Big Tech AI/ML/GenAI/LLMs: Change or Perish
All AI owned by Big Tech, Nvidia, Google, Apple, Facebook, Amazon and Microsoft, as well as Baidu, Alibaba and Tencent, be it big-data narrow cloud AI platforms, ML algorithms, DL models, generative AI or Large Language Models packaged as various chatbots, lacks grounding in reality and is incapable of real-world problem-solving.
All Big Tech's GenAI must "change or die": it needs a quantum leap from its Large Language Models (LLMs) to Large World Models, all integrated into Large Causal World Models (LCWMs = the World/Reality AI Engine).
LCWMs are the future of AI as true, real, non-human generalist machine intelligence, extending beyond data, text, audio, images and code with statistical predictive analytics to cover the entire spectrum of reality: physical, mental, social and digital. An LCWM will process real-world data from various sources, such as scientific instruments, space probes, IoT devices, sensors, cameras and more, to comprehend and interact with the world in the most rational, efficient, dynamic and real-time ways.
Real AI is a world-information-processing and scientific modelling, simulation and understanding machine that produces models representing reality: its entities and objects, phenomena, physical processes, patterns and relationships.
Thus it models, defines, quantifies, visualizes, simulates, and understands some domains, parts or features of the world, integrating different types of scientific models, such as "conceptual models to better understand, operational models to operationalize, mathematical models to quantify, computational models to simulate, and graphical models to visualize the subject".
Real, scientific AI models apply different scientific models and algorithms to data inputs to produce outputs, autonomously making discoveries or explanations, decisions or predictions, and simulating reality and its contents rather than simulating human intelligence.
Below is a schematic example of how Real/Causal/Scientific AI models could model chemical and transport processes related to atmospheric composition without the increasingly challenging amounts of data and computing power needed to train and run statistical AI/LLMs.
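As a rough illustration of the idea (not the actual schematic), here is a minimal sketch of a causal, physics-based model: a one-dimensional advection-diffusion equation transporting a chemical tracer away from an emission source. The grid size, wind speed, diffusivity and emission values are invented for the demo; the point is that the dynamics come from physical laws rather than from fitting a statistical model to massive data.

```python
# Minimal sketch (illustrative only): a 1D advection-diffusion model of a chemical
# tracer, standing in for the kind of causal, physics-based process model a
# "scientific AI" would run instead of a purely statistical fit.
# All numerical values below are invented for this demo.
import numpy as np

def step_tracer(c, wind, diffusivity, emissions, dx, dt):
    """Advance tracer concentration c by one time step: upwind advection + diffusion + sources."""
    advection = -wind * (c - np.roll(c, 1)) / dx                                   # upwind transport
    diffusion = diffusivity * (np.roll(c, -1) - 2 * c + np.roll(c, 1)) / dx ** 2   # turbulent mixing
    return c + dt * (advection + diffusion + emissions)

nx, dx, dt = 100, 1.0, 0.1            # 100 grid cells, periodic boundaries
concentration = np.zeros(nx)
emissions = np.zeros(nx)
emissions[nx // 2] = 1.0              # a single constant emission source mid-domain
wind, diffusivity = 2.0, 0.5

for _ in range(500):                  # simulate 50 time units
    concentration = step_tracer(concentration, wind, diffusivity, emissions, dx, dt)

print(f"Peak downwind concentration: {concentration.max():.2f}")
```

A model like this can explain why a concentration peak forms downwind of the source; a purely statistical model can only report that it usually does.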
This approach covers both rule-based, symbolically programmed AI and statistical, computationally trained AI models, i.e. ML/DL/NN/GenAI techniques: supervised learning, unsupervised learning and reinforcement learning.
LCWMs empower AI/ML/LLM machines to comprehend and interact with the world with scientific depth and causal insight, integrating human sense data with non-human environmental sensors: infrared, radar, thermal scanners and other IoT data.
LCWMs thus enable real AI systems with real intelligence, learning, inference, prediction, decision-making and interaction, shifting the paradigm from understanding the world through digital natural-language data, such as text, to engaging with reality in all its complexity.
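To make the idea concrete, here is a purely hypothetical sketch of what an LCWM-style perceive, update and predict loop might look like in code. The class and method names are invented for illustration and do not refer to any existing system or API.

```python
# Hypothetical sketch of an LCWM-style interface; every name here is invented.
from dataclasses import dataclass, field

@dataclass
class WorldState:
    """Internal estimate of the environment, fused from heterogeneous sensors."""
    estimates: dict = field(default_factory=dict)

class CausalWorldModel:
    """Toy stand-in for an LCWM: observe, update internal state, predict outcomes."""

    def __init__(self) -> None:
        self.state = WorldState()

    def update(self, sensor_readings: dict) -> None:
        # Fuse camera, infrared, radar and IoT readings into one state estimate.
        # A real system would weight each source by its uncertainty (e.g. Bayesian filtering);
        # this placeholder simply keeps the latest value per sensor.
        self.state.estimates.update(sensor_readings)

    def predict(self, action: str, horizon_s: float = 1.0) -> dict:
        # A real LCWM would roll a causal/physical simulation forward here;
        # this placeholder just echoes the current estimate alongside the intended action.
        return {"action": action, "horizon_s": horizon_s,
                "expected_state": dict(self.state.estimates)}

# Usage: ingest mixed sensor data, then ask what a candidate action would lead to.
model = CausalWorldModel()
model.update({"thermal_c": 34.2, "radar_range_m": 120.0, "camera_objects": ["vehicle"]})
print(model.predict("slow_down"))
```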
The applications of LCWMs are vast and varied, covering virtually every part of human life and every sector of society, from personal life to deep space exploration.
From AI Hallucinations to the Trough of Disillusionment
Now let's return to our subject. After rising nearly 200% in the first six months of the year, Nvidia saw its share price fall 20% in two months, tracking Gartner's placement of gen AI in the "Trough of Disillusionment" in its latest Hype Cycle chart.
Excitement around genAI's large language foundation models, such as Google Gemini, Anthropic Claude, Amazon Bedrock, and OpenAI GPT-4, is waning among enterprises as companies instead seek concrete returns on investment (ROI).
Along with OpenAI's GPT-3 and GPT-4, popular LLMs include open models such as Google's LaMDA and PaLM (the basis for Bard), Hugging Face's BLOOM and XLM-RoBERTa, Nvidia's NeMo LLM, XLNet, Co:here, GLM-130B, and Google's new PaLM 2.
The genAI startups are too numerous to count.
Large language models, the algorithmic basis for the chatbots mentioned above, rely on billions or even trillions of parameters, which can make them both inaccurate and too unspecific for vertical-industry use.
Here's what LLMs are and how they work. An LLM is a computational model capable of language generation and other natural language processing tasks; it acquires these abilities by learning statistical relationships, rather than realistic, objective or causal patterns, from vast amounts of mostly biased data/text during self-supervised and semi-supervised training.
These models are missing causal predictive power regarding syntax, semantics, and ontologies.
They are sold as machine learning models that can comprehend and generate human language by analyzing massive data sets of text.
A GPT-style LLM is a computer algorithm that processes natural-language input and computes the probability of the next word based on what it has already been trained on, or rote-learnt. It then predicts the next token/word, and the next, and so on until its answer is complete.
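A toy bigram model makes this token-by-token loop concrete. Real LLMs compute next-token probabilities with transformer networks over subword tokens, not word-count tables, but the decoding loop is the same: predict a token, append it, repeat. The tiny corpus below is invented for the demo.

```python
# Toy illustration of next-token prediction: a bigram model that picks the most
# probable next word given the previous one, then loops until it stops.
from collections import Counter, defaultdict

corpus = "the model predicts the next word and the next word again".split()

# Count how often each word follows each other word.
bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word(prev: str) -> str:
    """Return the most probable next word, or a stop marker if the word was never seen."""
    followers = bigram_counts.get(prev)
    return followers.most_common(1)[0][0] if followers else "<end>"

# Generate a short continuation one token at a time, just like an LLM's decoding loop.
word, output = "the", ["the"]
for _ in range(6):
    word = next_word(word)
    if word == "<end>":
        break
    output.append(word)
print(" ".join(output))
```

The model has no idea what a "word" refers to in the world; it only reproduces the statistics of its training text, which is exactly the limitation described above.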
LLMs are trained on a massive amount of unlicensed data: articles, Wikipedia entries, books, internet-based resources and other inputs, in order to produce human-mimicking responses to natural-language prompts.
Acting as ML/NN models trained through endless input/output data sets, an LLM takes information in, and its output is what the algorithm predicts the next word will be. The input can be proprietary corporate data or, as in the case of ChatGPT, whatever data it is fed and scrapes directly from the internet.
Training LLMs on the right data requires massive, expensive server farms acting as supercomputers, which now need nuclear energy sources to keep scaling.
Besides, AI "hallucinations in language models are not just occasional errors but an inevitable feature of these systems, stemming from the fundamental mathematical and logical structure of LLMs. It is, therefore, impossible to eliminate them through architectural improvements, dataset enhancements, or fact-checking mechanisms."
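The cited argument (see the abstract under Resources) can be restated as a simple compounding-probability bound; the notation below is a simplification of my own, not the paper's.

```latex
% Simplified restatement (notation is mine, not the paper's): if every stage of the
% LLM pipeline, from training data compilation to fact retrieval, intent
% classification and text generation, produces an error with some probability
% \epsilon_i > 0, then the chance of a fully hallucination-free answer stays
% strictly below 1 and shrinks as stages multiply:
P(\text{no hallucination}) \;=\; \prod_{i=1}^{n} \bigl(1 - \epsilon_i\bigr) \;<\; 1,
\qquad \epsilon_i > 0 \ \text{for every stage } i .
```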
No World Models, No Real Intelligence and Learning
Due to the absence of comprehensive computational, "mental" world models such as LCWMs, all the aforementioned generative AI tools, LLMs and chatbots "perceive patterns or objects that are nonexistent or imperceptible to human observers, creating outputs that are nonsensical or altogether inaccurate", hallucinating nonexistent patterns or features, false data or misinformation.
Such AI hallucinations have significant consequences for real-world applications, making all Big Tech AI unreliable, untrustworthy and harmful, and sealing its fate in the Trough of Disillusionment unless it is properly transformed into LCWMs.
The quantum leap to LCWMs, the World/Reality AI Engine, is not just an innovation but a constructive disruption in emerging digital technology, enabling real intelligent machines that understand and interact with the world, humans and information in the most rational, ethical and comprehensive ways.
Resources
If we’re not careful, Microsoft, Amazon, and other large companies will leverage their position to set the policy agenda for AI, as they have in many other sectors.
As Large Language Models become more ubiquitous across domains, it becomes important to examine their inherent limitations critically. This work argues that hallucinations in language models are not just occasional errors but an inevitable feature of these systems. We demonstrate that hallucinations stem from the fundamental mathematical and logical structure of LLMs. It is, therefore, impossible to eliminate them through architectural improvements, dataset enhancements, or fact-checking mechanisms. Our analysis draws on computational theory and Gödel's First Incompleteness Theorem, which references the undecidability of problems like the Halting, Emptiness, and Acceptance Problems. We demonstrate that every stage of the LLM process—from training data compilation to fact retrieval, intent classification, and text generation—will have a non-zero probability of producing hallucinations. This work introduces the concept of "Structural Hallucinations" as an intrinsic nature of these systems. By establishing the mathematical certainty of hallucinations, we challenge the prevailing notion that they can be fully mitigated.