AI Foundation Models. Part II: Generative AI + Universal World Model Engine
Find out the fundamentals of real artificial intelligence and true machine learning: what the nature of generative AI is, what its basics and building blocks are, and how it could become really intelligent.
To build truly intelligent machines, we have to teach them the universal world model and data ontology (UFO), or how to interact with the world, machines and humans, by basing AI/ML/DL/GPT models on the world hypergraph interaction networks.
The essence/"brains" of all AI is not deep neural networks (DNNs) but universal world model engine and data ontology, which is an implementation of universal formal ontology of all reality (a universal AI classifier) encoded as the world hypergraph interaction networks.
A universal AI classifier, as the master algorithm, classifies all the prime entities in the world, explaining, discovering and predicting their causal regularities, interactions and structures. It covers all meaningful special classifiers, deterministic and logical, statistical and probabilistic, whether as AI models, ML algorithms or DNNs.
Introduction
Special interest attaches to the so-called foundation model or base model, a misnomer popularized by the Stanford Institute for Human-Centered Artificial Intelligence's (HAI) Center for Research on Foundation Models (CRFM).
In fact, it has no world models, only Infinite Data, Infinite Neural Networks, and Infinite Compute Power, with no knowledge of Intelligence or AI.
Trained to do a wider range of tasks (such as text synthesis, image manipulation and audio generation), unlike a narrow AI, it is commercialized as a general-purpose AI or GPAI system, with the GPT-x series included.
Here is the Generative AI Infrastructure Stack based on Large Language & Foundational Models:
Real AI Models vs. Unreal AI Models: making all the difference
AI (Artificial Intelligence) refers to the intelligence demonstrated by machines. There is real and true, objective and scientific AI, and there is unreal and false, subjective and nonscientific AI, as different as General Global AI Models vs. Narrow Specialized AI Models.
The Unreal AI models are all about making computers and machines learn, reason or make decisions like humans, replicating the human body/brain/intelligence/behavior/tasks.
The Real AI Models are all about making computers and machines learn, infer, act and react by simulating and modelling reality itself directly: its entities, changes and interactions, laws, rules and patterns, so as to interact with the world effectively and sustainably.
Then there is the TIME100 Most Influential People in AI, which is about fake, false or unreal AI. A side note: this Time is the same outlet that first published EJ's hysterical forecast that "we are all gonna die" if AGI comes without his alignment; now we have this listing of fake AI models.
The Best Description of Unreal AI
Artificial Intelligence (AI) is a way of making software think intelligently, in a similar way to how intelligent humans think. AI attempts not just to understand an intelligent entity and the way it perceives, understands, predicts, and manipulates a world far larger and more complicated than itself, but also to build one. The AI discipline studies the way the human brain thinks, and the way humans learn, decide and work while trying to solve a particular problem. The outcomes of this study are used as a basis for developing intelligent software and systems.
Two major goals of AI summarize what scientists and researchers aim to achieve today:
Create expert systems, i.e., systems that exhibit intelligent behaviour, learn, demonstrate, explain, and advise their users.
Implement human intelligence in machines, i.e., create systems that understand, think, learn, and behave like humans.
The traditional AI research problems include, but are not limited to, autonomous agents, reasoning, knowledge representation, planning, learning, natural language processing, perception, social intelligence, and object/face recognition.
AI approaches to solving problems include, but are not limited to, statistical methods, computational intelligence, and symbolic AI. Several methods and tools are used in AI, including search and mathematical optimization, artificial neural networks, machine learning, conceptual modelling, knowledge/ontology engineering, and methods based on statistics, probability and economics.
AI relies on computer science, mathematics, psychology, linguistics, philosophy and many others. In the 21st century, AI techniques have experienced a resurgence, following concurrent advances related to the Internet/WWW, chipset size, computer power, volume/velocity/variety/veracity of data, social networking and theoretical understanding.
AI techniques help researchers solve many challenging problems in computer science and social life. AI applications are relevant to almost any intellectual task. Modern artificial intelligence applications are pervasive and numerous.
Frequently, when an application/technique of AI reaches mainstream use, it is no longer considered artificial intelligence.
Some of the most widely used AI applications are included in domains such as:
- healthcare (e.g., knowledge-based diagnosis, medical personal assistants for the elderly, decision-support systems for cancer treatment, robotic surgery),
- automotive (e.g., driverless cars, distributed multi-agent coordination of autonomous vehicles),
- finance/economics (e.g., fraud and financial crime detection, AI-based buying and selling platforms),
- gaming (e.g., dynamic purposeful behaviour in non-player characters (NPCs), pathfinding, deep learning prediction),
- military (e.g., weaponized autonomous drones),
- security (e.g., speech/image/object/face recognition),
- advertising (e.g., predicting the behaviour of customers),
- art and culture (e.g., e-auctions, automated story-telling creation, digital museum guides, automated music synthesis),
- social life (e.g., digital personal assistants, smart homes).
Artificial General Intelligence (AGI) is an emerging AI research field aiming at the development of "thinking machines". The AGI society describes those machines as general-purpose systems with an intelligence comparable to human intelligence (and perhaps beyond human intelligence). While this was the original goal of AI, the mainstream of AI research has focused on domain-dependent and problem-specific solutions. [Artificial General Intelligence and Creative Economy]
Generative AI as a Foundational Model and Statistical Classifier: Cons and Pros
All generative AI models belong to a special type of statistical classifier, the generative classifier, while a classifier based on a discriminative model is a discriminative classifier.
A generative AI is based on a "foundation model or base model", being the reverse of the discriminative classifier.
The Gartner Hype Cycle for AI, 2023, places generative AI at the peak of inflated expectations, from which it is set to slide into the trough of disillusionment.
Trained to do a wider range of tasks (such as text synthesis, image manipulation and audio generation), it is commercialized as a general-purpose AI or GPAI system, with the GPT-x series included, while being a statistical classification tool, a generative classifier.
Statistical classification, together with clustering and regression, is a type of pattern recognition, "the automated recognition of patterns and regularities in data". Pattern recognition systems are trained from labeled or unlabeled "training" data, thus differing from the pattern-matching algorithms of web search engines. They are generally categorized by the type of learning procedure used to generate the output value, as supervised or unsupervised, semi-supervised or self-supervised, depending on whether the training data set D = {(x, y)} of input instances x ∈ X and output labels y ∈ Y is labeled, unlabeled or mixed. The aim is to minimize an expected loss/cost/error function for an incorrect label, or to maximize a reward/utility/profit/fitness function.
Its classifiers assign each input value to one of a given set of classes, identifying the set of categories (sub-populations) to which an observation belongs. Observations are analyzed as feature vectors in a multidimensional probability space: quantifiable properties or variable quantities, explanatory variables (independent variables, regressors, etc.). The properties/variables/data are ordered as categorical, ordinal, integer-valued or real-valued variables.
A classification algorithm that implements the mathematical function mapping input data to a category is a classifier. It works by comparing observations to previous observations by means of a similarity measure/function or a metric/distance function, making data-driven inferences, decisions or predictions and building mathematical models from input data.
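To make the classifier idea concrete, here is a minimal sketch, assuming nothing beyond NumPy: a nearest-neighbour classifier that assigns a class by comparing a new observation to previous observations through a distance metric.

```python
import numpy as np

def knn_predict(X_train, y_train, x_new, k=3):
    """Assign x_new to the majority class of its k nearest neighbours."""
    distances = np.linalg.norm(X_train - x_new, axis=1)  # Euclidean distance metric
    nearest = np.argsort(distances)[:k]                  # k most similar observations
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Labeled training set D = {(x, y)}: feature vectors and their class labels.
X_train = np.array([[1.0, 1.0], [1.2, 0.8], [4.0, 4.2], [3.8, 4.0]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([4.1, 3.9])))  # -> 1
```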
In general, there are two main approaches, the generative approach and the discriminative approach, differing in the degree of statistical modelling.
A generative model is a statistical model of the joint probability distribution P(X, Y) over a given observable variable X and target variable Y.
A discriminative model is a model of the conditional probability P(Y|X) of the target Y given an observation X = x.
Or, put in probabilistic terms: x and y are samples of random variables X and Y, where Y is the variable being predicted and ŷ(x) is the predicted value.
In such cases, where a single value of x can correspond to several values of y, the best choice for ŷ(x) (in order to minimize the mean squared error) is the conditional expectation E[Y|X=x].
This means that if you train a very expressive neural network to predict y given x (with a sufficiently big dataset), then your network converges to E[Y|X=x].
Similarly, the best choice for x̂(y) is E[X|Y=y]: if you train your very expressive network to predict x given y, then it converges to E[X|Y=y].
Hence, the question of how ŷ(x) relates to x̂(y) in probabilistic settings can be rephrased as how the conditional expectations E[Y|X=x] and E[X|Y=y] relate to each other.
The nature of such probabilistic relationships is rather counter-intuitive.
For instance, the probabilistic linear relationship ŷ(x) = αx does not necessarily have the linear inverse x̂(y) = y/α, due to a 'noise' or 'error' term Z, an additional random variable: Y = aX + Z.
Generalizing: if y can be estimated as a linear or nonlinear function of the causal variable x, then x can also be estimated as a linear or nonlinear function of y, but in general not as the naive inverse of the forward function.
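A minimal simulation, assuming the standard-normal noisy channel Y = X + Z just mentioned, illustrates this asymmetry: the forward predictor is E[Y|X=x] = x, but the best reverse predictor is E[X|Y=y] = y/2, not the naive inverse x = y.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=1_000_000)   # cause
Z = rng.normal(size=1_000_000)   # independent noise
Y = X + Z                        # effect: E[Y|X=x] = x

# Empirical E[X | Y ≈ 2.0]: average X over samples whose Y lands near 2.
band = np.abs(Y - 2.0) < 0.05
print(X[band].mean())            # ≈ 1.0, i.e. y/2, not y
```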
This reflects a fundamental input-output data interaction rule, formally expressed by Bayes' theorem (alternatively Bayes' law or Bayes' rule):
P(X|Y)P(Y) = P(Y|X)P(X),
where X and Y are interacting events. The theorem interconnects the posterior conditional and marginal probabilities with the prior and likelihood probabilities, with X interpreted as the distribution of the input variables and Y as that of the outputs. Due to this interactive reversibility, one can estimate a generative model given the discriminative model, and vice versa.
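As a worked illustration of this reversibility (the numbers are invented for the example): a generative model stores the joint P(X, Y), both conditional directions follow from it, and Bayes' rule ties them together.

```python
import numpy as np

# Joint distribution P(X, Y) over binary X (rows) and Y (columns).
P_XY = np.array([[0.30, 0.10],
                 [0.20, 0.40]])

P_X = P_XY.sum(axis=1)             # marginal P(X)
P_Y = P_XY.sum(axis=0)             # marginal P(Y)

P_Y_given_X = P_XY / P_X[:, None]  # discriminative direction P(Y|X)
P_X_given_Y = P_XY / P_Y[None, :]  # generative direction  P(X|Y)

# Bayes' rule: P(X|Y)P(Y) = P(Y|X)P(X); both sides recover the joint.
assert np.allclose(P_X_given_Y * P_Y[None, :],
                   P_Y_given_X * P_X[:, None])
```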
Through the combination of generative models and deep neural networks, one obtains deep generative models (DGMs) with hundreds of millions or billions of parameters. These very large deep generative models underlie generative AI applications such as Large Language Models.
So, all generative AI models are essentially statistical classifiers marked by Infinite Data, Infinite Neural Networks, and Infinite Compute Power, with no scientific world models, real learning or Intelligence.
The Generative AI Infrastructure Stack
This is the reason for its conceptual and technical complexities, as one can see from the nine-level Generative AI Infrastructure Stack described in the article The Building Blocks of Generative AI:
Semiconductors, Chips, Cloud Hosting, Inference, Deployment
Generative AI models require powerful computational resources and large datasets for training and for generating outputs. GPUs and TPUs (specialized chips), along with cloud computing platforms, form the base of the Generative AI infrastructure stack. Cloud platforms like AWS, Microsoft Azure, and Google Cloud provide scalable resources and GPUs for training and deploying generative AI models. GPU leader Nvidia recently crossed a $1 trillion market cap.
Orchestration Layer / Application Frameworks
This layer is supposed to facilitate seamless integration of AI models with different data sources, empowering developers to launch applications quickly.
Vector Databases (diagrammed as tables with infinite columns)
Vector databases are a specialized type of database that stores data in a manner that facilitates finding similar data: each piece of data is represented as a list of numbers, known as a vector, corresponding to the features or attributes of the data. Vector databases are presumed to represent the semantic meaning of data, enabling tasks like similarity search, recommendation, and classification. Several companies specialize in developing vector databases and embeddings.
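A minimal sketch of the core operation, assuming only NumPy (real vector databases add approximate-nearest-neighbour indexing such as HNSW on top): store embeddings as rows of a matrix and retrieve the most similar ones by cosine similarity.

```python
import numpy as np

def top_k_similar(index: np.ndarray, query: np.ndarray, k: int = 2):
    """Return indices of the k stored vectors most similar to query."""
    norms = np.linalg.norm(index, axis=1) * np.linalg.norm(query)
    scores = index @ query / norms          # cosine similarity per row
    return np.argsort(scores)[::-1][:k]     # highest similarity first

# Each row is the embedding ("list of numbers") of one piece of data.
index = np.array([[0.9, 0.1, 0.0],
                  [0.1, 0.9, 0.1],
                  [0.8, 0.2, 0.1]])
print(top_k_similar(index, np.array([1.0, 0.0, 0.0])))  # -> [0 2]
```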
Fine-Tuning
Fine-tuning involves further training a model on a specific task or dataset to enhance the model's performance and adapt it to the unique requirements of that task or dataset.
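A minimal PyTorch sketch of the fine-tuning idea (the model and data here are illustrative stand-ins, not any particular product's API): freeze a pretrained base and train only a small task head on the new dataset.

```python
import torch
import torch.nn as nn

# Stand-in for a pretrained model; in practice it is loaded from a checkpoint.
base = nn.Sequential(nn.Linear(128, 64), nn.ReLU())
head = nn.Linear(64, 2)                  # new task-specific classifier

for p in base.parameters():
    p.requires_grad = False              # freeze the pretrained weights

optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(32, 128)                 # a batch from the new task's dataset
y = torch.randint(0, 2, (32,))           # its labels
for _ in range(10):                      # short fine-tuning loop
    optimizer.zero_grad()
    loss = loss_fn(head(base(x)), y)     # gradients flow only into the head
    loss.backward()
    optimizer.step()
```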
Labeling
Accurate data labeling is crucial for the success of generative AI models.
Data can take various forms, including images, text, or audio. Labels serve as descriptions of the data; a set of labels is supposed "to teach the machine learning model what it needs to know". Data labeling plays a significant role in machine learning, as algorithms "learn" from data, and the accuracy of the labels directly impacts the algorithm's learning capabilities.

Every AI startup or corporate R&D lab faces the challenge of annotating training data to teach algorithms what to identify. Whether it is doctors assessing the size of a cancer from scans or drivers marking street signs in self-driving car footage, labeling is a necessary step. Inaccurate data leads to inaccurate results from the models.

Data labeling remains a significant challenge and an obstacle to the advancement of machine learning and artificial intelligence in many industries. It is costly, labor-intensive, and hard for subject experts to allocate time for, leading some to turn to crowdsourcing platforms when privacy and expertise constraints are minimal. As the Time article revealed:
"The world’s most powerful AI models are often trained on colossal amounts of data scraped from the internet. These huge datasets often contain copyrighted material, which has opened companies like Stability AI—the maker of Stable Diffusion—up to?lawsuits?that allege their AIs are unlawfully reliant on other people’s intellectual property. And because the internet can be a terrible place, large datasets also often contain toxic material like violence, pornography and racism, which—unless it is scrubbed from the dataset—can lead AIs to behave in ways they’re not supposed to".
Synthetic Data
Synthetic data is artificially created data that mimics real data; it offers several benefits and applications in the realm of machine learning and artificial intelligence (AI). Synthetic data safeguards privacy, as it lacks personally identifiable information (PII) and HIPAA risks. Compliance with data regulations, such as GDPR, is ensured while the data is still used effectively.
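One simple way to see the idea (an illustrative sketch, not a production method): fit a distribution to real records and sample new, PII-free records with similar statistics.

```python
import numpy as np

rng = np.random.default_rng(42)
# "Real" records: two correlated features, e.g. age and income
# (simulated here so the example is self-contained).
real = rng.multivariate_normal([35.0, 60_000.0],
                               [[64.0, 40_000.0],
                                [40_000.0, 1.44e8]], size=500)

mean = real.mean(axis=0)
cov = np.cov(real, rowvar=False)
synthetic = rng.multivariate_normal(mean, cov, size=500)  # mimics, names no one

print(np.round(real.mean(axis=0)), np.round(synthetic.mean(axis=0)))
```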
Model Supervision / AI Observability
The next level of the stack is AI observability, which is about monitoring, comprehending, and explaining the behavior of AI models. Put simply, it ensures that AI models are functioning correctly and making unbiased, non-harmful decisions. Model supervision, which is a subset of AI observability, specifically focuses on ensuring that AI models align with their intended purpose. It involves verifying that models aren’t making decisions that could be harmful or unethical.
Model Safety
At the top of the stack is model safety. One significant risk with generative AI is biased outputs. AI models tend to adopt and propagate biases present in the training data. Another concern is the malicious use of AI. Deep fakes, which involve the dissemination of false information through believable yet fabricated images, videos, or text, might become an issue.
A big existential question for generative AI: how does this all fit together? How could all these levels and their many companies smoothly interact, cooperate, collaborate and synergize?
It needs an intelligent command, control and communication center: a universal formal world model with a universal data architecture, playing the role worldviews play for deep human intelligence, for creating a universal AI/ML/DL classifier.
Real AI Models: Generative AI + Universal World Model Engine (Machine's Universal Classifier): UFO (Universal Formal Ontology)
Machine learning and artificial intelligence, as statistical classifiers, generative or discriminative, with their classification algorithms and pattern recognition systems, are superficially correlative and meaningless without ontological, semantic and scientific classifiers.
The system of categories that encompasses the classification of all things in the world, for humans and machines alike, consists in the Universal Formal Ontology (UFO).
Covering statistical, probabilistic and scientific classifiers, such a universal classifier is formalized as the Universal Computing Ontology of Fundamental Categorical Variables of the World:
W = <E, S, C, I; D>, where
The UFO Data Universe determines Data Architecture: a set of rules, policies, standards and models that govern and define the type of data, and create and manage the flow of data and how it is processed across IT/AI/ML/DL/GPT systems and applications. In particular, it concerns enterprise ML/AI data architecture, consisting of three different layers or processes:
So, the universal learning and understanding of reality follows the universal master algorithm:
Reality causes Entity causes State causes Change causes Interaction causes Data causes Intelligence causes Real AI Technology causes Intelligent Reality.
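As a minimal sketch of what a "world hypergraph interaction network" could look like in code (my illustrative reading, not the author's formal definition): hyperedges link any number of entities participating in one interaction, so the structure becomes queryable.

```python
from collections import defaultdict

# Hyperedges: each interaction links an arbitrary set of entities.
hyperedges = {
    "combustion": {"fuel", "oxygen", "heat"},
    "photosynthesis": {"sunlight", "water", "CO2", "plant"},
}

# Incidence map: which interactions does each entity take part in?
incidence = defaultdict(set)
for interaction, entities in hyperedges.items():
    for entity in entities:
        incidence[entity].add(interaction)

print(incidence["plant"])   # -> {'photosynthesis'}
```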
THE UNIVERSAL AI PLATFORM FOR NARROW AI, ML, DL, AGI, ASI, AND HUMAN INTELLIGENCE
Now we can understand how and why to create real and truly intelligent machines as universal AI networks integrating all interactive and interoperable AI forms and models, algorithms and systems (see the Resources):
Universal AI Platform = General AI = IAI = Real AI = Transdisciplinary AI = Man-Machine Hyperintelligence = UFO + Symbolic/Logical/General AI + Weak/Narrow AI + Machine Learning + Deep Learning + Federated Learning + ANNs + LLMs (GPT > ChatGPT >) + 5-6G + Multi-Access Edge Computing + the Internet of Things = Global AI Internet of Everything
The Universal AI Platform (Trans-AI) embraces the major AI innovations, such as those specified in the 2022 Gartner Hype Cycle for AI and the 2023 Hype Cycle for AI Technology:
Data-centric AI:
synthetic data,
knowledge graphs,
data labeling and annotation
Model-centric AI:
physics-informed AI,
composite AI,
causal AI,
generative AI,
foundation models and deep learning
Applications-centric AI:
AI engineering,
decision intelligence,
edge AI,
operational AI systems,
ModelOps,
AI cloud services,
smart robots,
natural language processing (NLP),
autonomous vehicles,
intelligent applications,
computer vision
Human-centric AI:
AI trust, risk and security management (TRiSM),
responsible AI,
digital ethics,
and AI maker and teaching kits.
Causal AI includes different techniques, like causal graphs and simulation, that help uncover causal relationships to improve decision making.
Real Artificial Intelligence (RAI/SET) Science, Engineering and Technology
Real Artificial Intelligence (RAI/SET) Science, Engineering and Technology is set to change how our world works and humans live, study, work, and play.
RAI/SET is the main engine of the new total digital revolution, scientific and technological, social and economic, cultural and religious.
The COVID-19 crisis has accelerated the need for human-machine digital intelligent platforms facilitating new knowledge, competences and workforce skills: advanced cognitive, scientific, technological and engineering skills, as well as social and emotional skills.
In the AI and Robotics era, there is a high demand for scientific knowledge, digital competence, and high-technology training in a range of innovative areas of exponential technologies, such as artificial intelligence, machine learning and robotics, data science and big data, cloud and edge computing, the Internet of Things, 5G, cybersecurity and digital reality.
The combined value – to society and industry – of digital transformation across industries could be greater than $100 trillion over the next 10 years.
“Combinatorial” effects of AI, ML, DL, Robotics with mobile, cloud, sensors, and analytics among others – are accelerating progress exponentially, but the full potential will not be achieved without the collaboration between humans and machines.
Conclusion
To build truly intelligent machines, we have to teach them the universal world model and data ontology, or how to interact with the world, founding AI/ML/DL/GPT models on the world hypergraph interaction networks.
Resources
Real AI Project Confidential Report: How to Engineer Man-Machine Superintelligence 2025: AI for Everything and Everyone (AI4EE); 179 pages, EIS LTD, EU, Russia, 2021
Content
The World of Reality, Causality and Real AI: Exposing the great unknown unknowns
Transforming a World of Data into a World of Intelligence
WorldNet: World Data Reference System: Global Data Platform
Universal Data Typology: the Standard Data Framework
The World-Data modeling: the Universe of Entity Variables
Global AI & ML disruptive investment projects
USECS, Universal Standard Entity Classification SYSTEM:
The WORLD.Schema, World Entities Global REFERENCE
GLOBAL ENTITY SEARCH SYSTEM: GESS
References
Supplement I: AI/ML/DL/CS/DS Knowledge Base
Supplement II: I-World
Supplement III: International and National AI Strategies