AI/ML/DL/LLM's Paradigm Shift: World Ontology and STEM are all we need
The success of Generative AI & LLMs & Foundational Models has generated much interest in creating real and true AI and machine intelligence and learning.
Trained to do a wider range of tasks?(such as text synthesis, image manipulation and audio generation)?today's AI/ML/DL/LLMs are a narrow and weak A, while commercialized as a general-purpose AI or GPAI system, where GPT-x series included
To become AI as such, the present narrow and weak AI/ML/DL/LLMs need a radical upgrading and revision, a paradigm shift towards truth and reality, as a major change in the worldview, concepts, and practices of how AI works or is accomplished.
It is argued that Science and Technology, Engineering and Mathematics (STEM), structured by Global Computing Ontology (OSTEM), it is what we need for creating a general-purpose intelligent technology as scientific and knowledge-based AI models, systems and applications:
true discriminative and generative AI
causal and objective ML
interactive DL Neural Networks, as LLMs and Foundational Models
AI and ML field has experienced several hype cycles, followed by disappointment and criticism, funding cuts and disinterest for years or even decades later, as in 1966-69 for ANNs, 1974–1980 for experts systems and 1987–2000 for symbolic, logical AI (the US SCI or Japan's Fifth Generation Computer).
Since 2010-2012, interest in AI and ML from the research communities and big tech corporations caused a dramatic increase in funding and investment, leading to the current (as of 2023) AI boom, with the generative AI & LLMs & Foundation Models race being a key driver of this hype.
We show how the world knowledge encoded as the science at large, philosophical and formal, natural and social, technological and engineering, with its global computing ontology could help avoid reversing the current AI spring into the last AI/ML "nuclear winter".
The Secrete Code of AI, ML, DL, and LLMs
There is an article, Attention Is All You Need, which, together with GPT-n series and NVIDIA's stock prices, prompted Generative AI, Foundational Models and Large Language Models to "The Peak of Inflated Expectations" in the Garter's Hype Cycle for AI, 2023.
Since the introduction of Transformer model in 2017, Generative AI & LLMs have evolved significantly. ChatGPT saw 1.6B visits in May 2023. Meta also released three versions of LLaMA-2 (7B, 13B, 70B).
Meantime, being in the middle of the AI Hype Cycle, Generative AI & LLM is falling to the Trough of Disillusionment.
Many hope that "Gartner Hype Cycle is just a really useless, wrong and a unhelpful representation, which plays no relevant role in the modern technology landscape".
Despite product and business model innovation and expected productivity boom from Generative AI & LLMs, a real-world ROI has been concentrated around predictive data analytics techniques and statistical classifiers, as utilizing tabular statistical datasets and tree-based methods such as classification and regression decision trees, XGBoost or Random Forest.
It is not the statistical learning algorithms as in the Classical ML or the data as in the Data centric AI or the compute or the attention mechanism, training datasets as in the GPT models is the secret sauce of AI/ML/DL — including LLMs.
It is science at large with its global computing ontology, as the sum of universal knowledge, the world knowledge, as of objects and properties, noumena and phenomena, data and facts, patterns and regularities, laws and principles, methods and models, causes and effects, interactions and networks.
LLMs and Foundational Models and Global Computing Ontology
Ontology is formal representation of the world at large while ontologies are formal representations of a world domain.
They provide the basic knowledge about the world's or domain's entities and interactions, axioms and rules, all possible objects and properties, structures and functions, elements and relationships, providing an intelligence framework enabling a deep understanding of that domain or subject area or universe of discourse.
Global ontology and applied ontologies are used to enable machine knowledge, learning, inference and interaction, and meaningful understanding, allowing an AI system to draw inferences, to explain conclusions and decisions, to discover new findings, to predict and derive new information and relationships between entities.
LLMs are ML foundational models that aim to generate human-like responses (including text and images, audio or video) based on an input (“prompt engineering”). They are trained on a large corpora of scraped internet/web data, as texts with trillions of words, computing the patterns and connections between words and images.
领英推荐
A LLM like ChatGPT or LLaMA activated to its digital life via pre-training in general language or domain-specific domains, instead of using the general computing ontology or domain-specific ontologies to build real pre-trained base models or task-specific pre-trained models.
It is added with fine-tuning and reinforcement learning from human feedback (the “make AI” process), and the “use AI” — inference process. In the full LLMs/Generative AI stack, there are observability, guardrail, governance and model safety levels.
The LLMs pre-training process is resource-, capital-, compute-, time-, and memory-intensive. Compute hardware to support AI training workflows, to find the optimal weights, its sparse matrix multiplications are distributed to tens of thousands of GPUs. GPT-3 with 175B parameters requires 3e23 FLOP (Floating Point Operation) of computation for training. For the most available SOTA GPU, Nvidia A100, which can conduct around 312 TeraFLOP per second (TFLOPs), a single GPU needs 30 years to train GPT-3.
Besides, on-chip memory size, like the newest AMD MI300x (which has 2.4x memory of Nvidia’s H100), is growing at a slower pace compared to the model size (directly contributes to weight size), as GPT-4-like multi-modal models, which could contain text, image, voice, or video capability, resulting in more complex memory requirements.
Again, LLMs or Generative AI for businesses will not work unless the underlying data it leverages is trustworthy.
Hence, scraping an enormous “web data base”, it is essentially incorrect and/or biased and hallucinating, generating false information.
LLMs generate new content based on their training data, but don't really know the sense and context, semantics or relationships in that content, thus failing to make sense of the world for effective and sustainable interactions.?
Last, not least, AI/ML/DL Models are actually very vulnerable, possessing unsafe or dangerous or harmful information, and being subject to adversarial attacks such as prompt injection not mentioning hallucination. See example below.
Ontology and STEM are all what we need
To build hyperintelligent machines, we have to teach them the world knowledge or how to interact with the world, transforming artificial neural networks into the world hypergraph interaction networks.
Real/True AI = Computing Ontology + World Knowledge (Universal Technoscience (UTS): {Philosophy, Science, Technology, Engineering, Mathematics} + Generative AI & LLMs & Foundational Models > Universal AI Platform
The Generative AI as the foundation model or base model is actually a misnomer popularized by?the Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM).
In fact, it has no world models but Infinite Data, Infinite Neural Networks, and Infinite Compute Power, with no world knowledge or Intelligence or real and true AI.
Conclusion
We explored the similarities and distinctions between global computing ontology and applied ontologies and Generative AI & LLMs, as well as how they can be effectively integrated together, for have the safe, transparent, and secure development of general-purpose AI technology.?
Resources