Unreasonable Effectiveness of Mathematics - Named Sets, Knowledge Structures, Theory of Oracles, Structural Machines, "Strong AI" and all that Jazz
Rao Mikkilineni, Ph.D.
CTO at Opos.ai, Distinguished Adjunct Professor at Golden Gate University, California, and Adjunct Associate Professor at Dominican University of California.
Summary
Major attributes of an intelligent system are self-awareness, self-management, local autonomy, and knowledge of global interaction and information sharing. A self-managing system by definition implies two components: the observer (or the Self with an identity – the inside) and the observed (or the environment – the outside), with which the observer interacts by monitoring and controlling the aspects that matter to it. It also implies that the observer is aware of systemic goals (spanning both inside and outside) so that it can monitor, measure and control (with local autonomy) its interaction with the observed. In living organisms, self-managing behavior is attributed to the sense of self and to awareness, which together allow the organism to define its multiple tasks, pursue specific goals within a dynamic environment, and adapt its behavior accordingly.
This inside view of the outside is contrary to current digital computing structures created by software and hardware, which emerged from the simple theory of the Turing machine (TM), “replacing the man in the process of computing a real number by a machine which is capable of only a finite number of conditions.” Current IT is based on von Neumann’s stored-program-control implementation of the Turing machine and addresses concurrent and synchronous distributed computations (connected Turing machines provide sequential computing). Self-managing systems have autonomy, and their information processing and information networking processes are asynchronous. These processes are managed by an overlay of cognitive processes which use learning and history to manage themselves. Their interactions with each other and their environment are aimed at establishing stable equilibria by sharing and managing information.
In this post, we examine the characteristics of self-managing systems and propose that the new mathematical insights about named sets, knowledge structures, the theory of Oracles and structural machines offer a novel path to design and implement a new class of self-managing digital systems that are sentient, resilient and intelligent. We argue that the knowledge structures can be implemented using managed microservices and hierarchical cognizing agents (both implemented as TMs).
The Jazz metaphor aptly describes the current transition from Church-Turing-thesis-governed computing structures to hierarchical-intelligence-governed information processing structures based on the theories of named sets, knowledge structures, Oracles and structural machines. The thesis in the IT evolution is the automation of business processes and service delivery using Church-Turing-thesis-bounded computational structures implementing both business process automation and neural-network-based deep learning. The antithesis is the limitations of the Church-Turing thesis in addressing rapid fluctuations in the demand for resources, the incompleteness of logical systems, uncomputability, and the self-referential circularity of all logical systems not being moored to reality outside of themselves. The synthesis of the Jazz metaphor applies to the application of new mathematics to build hierarchical intelligence that manages concurrent autonomous distributed computing structures which communicate asynchronously to monitor and maintain global consistency while preserving local autonomy. The future is in building sentient, resilient and intelligent digital distributed information processing structures that mimic living organisms with a mind-body-brain model.
The Audience for this Post
This post is intended for young scientists, mathematicians, philosophers, and computer engineers who are curious and willing to learn new approaches to information processing in the digital world. It is also intended for those physicists, biologists, mathematicians, computer engineers, computer scientists and philosophers who connect the dots across multiple disciplines, for their comments and critique.
It is not "dumbed down" for VCs, nor is it addressed to marketeers looking for taglines in their Twitter feeds to influence the "AI community."
I apologize that this is not a scholarly journal article and does not follow the conventions of proper referencing, although I have cited many papers that I have used or published myself in conferences and peer-reviewed journals. Some of this work started with my publishing a paper (with my colleagues) in the Turing Centenary Conference Proceedings in 2012.
Any comments, critique and suggestions will only improve and hasten the application of these theories to dramatically improve our information technologies and to create digital systems that are sentient, resilient and intelligent.
1. Introduction
“There is a story about two friends, who were classmates in high school, talking about their jobs. One of them became a statistician and was working on population trends. He showed a reprint to his former classmate. The reprint started, as usual, with the Gaussian distribution and the statistician explained to his former classmate the meaning of the symbols for the actual population, for the average population, and so on. His classmate was a bit incredulous and was not quite sure whether the statistician was pulling his leg. "How can you know that?" was his query. "And what is this symbol here?" "Oh," said the statistician, "this is pi." "What is that?" "The ratio of the circumference of the circle to its diameter." "Well, now you are pushing your joke too far," said the classmate, "surely the population has nothing to do with the circumference of the circle."”
This story is from Eugene Wigner’s talk [1] titled “The Unreasonable Effectiveness of Mathematics In The Natural Sciences.” He goes on to say “The first point is that mathematical concepts turn up in entirely unexpected connections. Moreover, they often permit an unexpectedly close and accurate description of the phenomena in these connections. Secondly, just because of this circumstance, and because we do not understand the reasons of their usefulness, we cannot know whether a theory formulated in terms of mathematical concepts is uniquely appropriate.”
Once again mathematics has shown up in an unexpected connection dealing with information processing structures. We describe here the new mathematics of named sets, knowledge structures, theory of oracles and structural machines and how they allow us to advance digital information processing structures to become sentient, resilient and intelligent. Sentience comes from the Latin sentient-, "feeling," and it describes things that are alive, able to feel and perceive, and show awareness or responsiveness. The degree of intelligence (the ability to acquire and apply knowledge and skills) and resilience (the capacity to recover quickly from non-deterministic difficulties without requiring a reboot) depend on the cognitive apparatuses the organism has developed.
While there are many scholarly books, articles and research papers published in the last decade explaining both the theory and a few novel implementations that demonstrate the power of the new mathematics, they are not yet understood well by many. Here is an open secret that is up for grabs for any young computer scientist or IT professional with the curiosity to make a major impact in shaping next-generation information processing systems that are “truly” self-managing and therefore sentient, resilient and intelligent. What you need is an open mind and a willingness to challenge the status quo touted by big companies with a lot of money and marketing prowess. In this post, I will try to share what I have understood from a lot of reading and from trying to implement some of the concepts to create sentient, resilient and intelligent distributed digital information processing systems.
2. Characteristics of Self-Managing Systems:
Here is an excerpt from a paper I published that summarizes some characteristics an autonomic system must possess based on my readings of various scholars who studied cognition [2]. “An autonomous system is typically considered to be a self-determining system, as distinguished from a system whose behavior is explicitly externally engineered and controlled. The concept of autonomy (and autonomous systems) is, therefore, crucial to understanding cognitive systems. According to Maturana [3,4] a cognitive system is a system whose organization defines a domain of interactions in which it can act with relevance to the maintenance of itself, and the process of cognition is the actual (inductive) acting or behaving in this domain. If a living system enters into a cognitive interaction, its internal state is changed in a manner relevant to its maintenance, and it enters into a new interaction without loss of its identity. A cognitive system becomes an observer through recursively generating representations of its interactions, and by interacting with several representations simultaneously it generates relations with the representations of which it can then interact and repeat this process recursively, thus remaining in a domain of interactions always larger than that of the representations. In addition, it becomes self-conscious through self-observation; by making descriptions of itself (representations), and by interacting with the help of its descriptions it can describe itself describing itself, in an endless recursive process.”
These observations lead us to conclude that self-management is an outcome of the cognitive abilities of a system. Cognition, therefore, is an essential part of any design of an information processing structure that is to exhibit self-management. A cognitive system must have the following key attributes:
- A knowledge of its state and the ability to compute (evolve its state from the current state to the next state based on process evolution knowledge). In cellular organisms, this knowledge is incorporated in the “gene” and allows the organism to know and change its state through physical, chemical and biological information processing structures by converting matter to energy (the metabolism);
- A self-identity that does not change when a state change occurs with interaction;
- A domain of interaction;
- A cognitive interaction process (with sensors and actuators supported by neurons in nerve cells or the brain) that allows an observer to recursively generate representations of its interactions. By interacting with several representations simultaneously, the “observer” generates relations with the representations, with which it can then interact, and repeats this process recursively, thus remaining in a domain of interactions always larger than that of the representations.
As I pointed out in the same paper, “the autonomy in cellular organisms comes from three sources:
- Genetic knowledge that is transmitted by the survivor to its successor in the form of executable workflows and control structures that describe stable patterns to optimally deploy the resources available to assure the organism’s safe keeping in interacting with its environment.
- The ability to dynamically monitor and control organism’s own behavior along with its interaction with its environment using the genetic descriptions and
- Developing a history through memorizing the transactions and identifying new associations through analysis. In short, the genetic computing model allows the formulation of descriptions of workflow components with not only the content of how to accomplish a task but also provide the context, constraints, control and communication to assure systemic coordination to accomplish the overall purpose of the system.”
As I wrote in another paper with Morana [5], “a cellular organism is the simplest form of life that maintains an internal environment that supports its essential biochemical reactions, despite changes in the external environment. The cell adapts to its environment by recognition and transduction of a broad range of environmental signals, which in turn activate response mechanisms by regulating the expression of proteins that take part in the corresponding processes. The regulatory gene network forms a cellular control circuitry defining the overall behavior of the various cells. The complex network of neural connections and signaling mechanisms collaborate to create a dynamic, active and temporal representation of both the observer and the observed with myriad patterns, associations and constraints among their components.
It seems that the business of managing life is more than mere book-keeping that is possible with a Turing machine. It involves the orchestration of an ensemble with a self-identity both at the group and the component level contributing to the system’s biological value. It is a hierarchy of individual components where each node itself is a sub-network with its own identity and purpose that is consistent with the system-wide purpose. To be sure, each component is capable of book-keeping and algorithmic manipulation of symbols. In addition, identity and representations of the observer and the observed at both the component and group level make system-wide self-reflection possible. In short, the business of managing life is implemented by a system consisting of a network of networks with multiple parallel links that transmit both control information and the mission critical data required to sense and to control the observed by the observer. The data and control networks provide the capabilities to develop an internal representation of both the observer and the observed along with the processes required to implement the business of managing life. The organism is made up of autonomic components making up an ensemble collaborating and coordinating a complex set of life’s processes that are executed to sense and control both the observer and the observed.”
The self-managing system architecture supports distributed concurrent information processing structures with local autonomy and global sharing of information (asynchronous communication) to create sentient, resilient and intelligent processes which establish equilibrium states of interaction within the self and with the environment under non-deterministic fluctuations, through homeostasis and autopoiesis. Homeostasis is the tendency toward a relatively stable equilibrium between interdependent elements, especially as maintained by physiological processes. Autopoiesis is the property of a living system (such as a bacterial cell or a multicellular organism) that allows it to maintain and renew itself by regulating its composition and conserving its boundaries.
In the next section we will discuss the Turing machine's shortfall and argue that simulating this cognitive activity transcends the mere book-keeping capabilities of a Turing machine.
3. Why Turing Machine-Based Information Processing Structures Fall Short in Designing and Implementing Self-Managing Systems Despite Many Vehement Claims to the Contrary:
One of the main steps towards infusing cognition into current digital information processing structures to make them self-managing is to make visible the host of myths surrounding the old paradigm and helping it survive. One of those myths is that our modern computers with all their programming languages are diverse implementations of Turing machines. Gordana Dodig-Crnkovic and Raffaela Giovagnoli provide a succinct discussion of the limitations of Turing Machine-based information structures in their book “Computing Nature – Turing Centenary Perspective” [6]. They point out two factors that inhibit infusing cognition into TM-based information processing structures.
Self-managing systems require a self-referential model with local autonomy and global information sharing to model, monitor and manage their homeostasis and autopoiesis processes. In essence, self-managing systems are distributed concurrent information processing structures with asynchronous communication. TMs, on the other hand, when connected, execute synchronous distributed and concurrent processes. “Computational logic must be able to model interactive computation, and that classical logic must be robust towards inconsistencies i.e. must be paraconsistent due to the incompleteness of interaction” [6].
As I pointed out in my Turing Centenary Conference paper in 2012 [7], “An important implication of Gödel’s incompleteness theorem [8] is that it is not possible to have a finite description with the description itself as the proper part. In other words, it is not possible to read yourself or process yourself as a process. However, as Turing [9] put it beautifully, “the well-known theorem [8] shows that every system of logic is in a certain sense incomplete, but at the same time it indicates means whereby from a system L of logic a more complete system L′ may be obtained. By repeating the process, we get a sequence L, L1 = L′, L2 = L′1, … each more complete than the preceding. A logic Lω may then be constructed in which the provable theorems are the totality of theorems provable with the help of the logics L, L1, L2, … Proceeding in this way we can associate a system of logic with any constructive ordinal. It may be asked whether such a sequence of logics of this kind is complete in the sense that to any problem A there corresponds an ordinal α such that A is solvable by means of the logic Lα.”
“On the other hand, it seems as if biology does not pay attention to Gödel’s incompleteness or undecidability theorems. As George Dyson [10] points out, a recursive computing model in the genome enables the beautiful unfolding of living organisms with self-configuration, self-monitoring, self-protection, and self-healing properties. "The analog of software in the living world is not a self-reproducing organism, but a self-replicating molecule of DNA. Self-replication and self-reproduction have often been confused. Biological organisms, even single-celled organisms, do not replicate themselves; they host the replication of genetic sequences that assist in reproducing an approximate likeness of themselves. For all but the lowest organisms, there is a lengthy, recursive sequence of nested programs to unfold. An elaborate self-extracting process restores entire directories of compressed genetic programs and reconstructs increasingly complicated levels of hardware on which the operating system runs."
Many people during the last couple of decades have called for computing models that go beyond TM-based computing structures, but they were silenced by computer scientists promoting the status quo. In this post we demonstrate how new mathematics points a way to create self-managing systems that could evolve just as biological systems did. More importantly, we can design systems that embody cognitive processes in their evolution to create a new class of sentient, resilient and intelligent systems as a first step. These systems will not exhibit emergence, but are designed to reconfigure themselves to manage non-deterministic fluctuations.
4. Going Beyond Church-Turing Thesis Boundaries: Digital Genes, Digital Neurons and the Future of AI
The ingenuity of von Neumann’s stored-program-control implementation of the Turing machine is that it provides a physical implementation of a cognitive apparatus to represent and transform knowledge structures that are created by the physical or mental worlds in the form of input and output. The cognitive apparatus is endowed with physical locality and an information velocity, in the form of computing and read/write speeds, that depend on the physical implementation. Figure 2 represents the implementation of Turing machines as cognitive apparatuses, each endowed with locality, a velocity of information flow, and the ability to form information processing structures among themselves.
Figure 2: Turing Machine implementation of the information processing structure
Any algorithm that can be specified is made executable using a CPU and memory. Execution of the algorithm changes the state from input to output. As long as there are enough resources (CPU and memory), the computation will continue as encoded in the algorithm. Cognition comes from the ability to encode knowledge structures and their processing to transform them from one state to another, just as genes in biology do.
It is interesting to note that Turing-computable functions also include algorithms that define neural networks, which are used to model processes that cannot themselves be described as algorithms, such as voice recognition, video processing, etc. This is equivalent to a digital gene (representing well-specified executable process evolutions) and a digital neuron (executing the cognitive processes that cannot be specified as genes), mimicking biological systems.
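To make the digital gene/digital neuron distinction concrete, here is a minimal Python sketch (the names digital_gene and DigitalNeuron are mine, purely for illustration): the gene is an explicitly specified, executable state transition, while the neuron's behavior is not written down as rules at all but learned from examples.

```python
# Illustrative sketch only: "digital gene" vs "digital neuron" as used in this post.

def digital_gene(state: dict) -> dict:
    """A digital gene: a fully specified, executable process evolution.
    Every rule for moving from the current state to the next is written down."""
    next_state = dict(state)
    next_state["cooling"] = state["temperature"] > state["set_point"]  # explicit, auditable rule
    return next_state


class DigitalNeuron:
    """A digital neuron: behavior is not specified as rules but learned from
    examples (here, a single linear unit trained by gradient descent)."""

    def __init__(self, n_inputs: int):
        self.weights = [0.0] * n_inputs
        self.bias = 0.0

    def predict(self, x):
        return sum(w * xi for w, xi in zip(self.weights, x)) + self.bias

    def train(self, samples, targets, lr=0.01, epochs=200):
        for _ in range(epochs):
            for x, y in zip(samples, targets):
                error = self.predict(x) - y
                self.weights = [w - lr * error * xi for w, xi in zip(self.weights, x)]
                self.bias -= lr * error


if __name__ == "__main__":
    print(digital_gene({"temperature": 75, "set_point": 70}))
    neuron = DigitalNeuron(1)
    neuron.train([[1.0], [2.0], [3.0]], [2.0, 4.0, 6.0])  # learns y ≈ 2x from data
    print(round(neuron.predict([4.0]), 2))
```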
Turing machine implementations of information processing structures, as Gödel proved, suffer from incompleteness and recursive self-reference, and therefore require external agents to instruct them and judge their outputs. Cockshott et al. [11] conclude their book “Computation and its Limits” with the sentence “Their logical limits arise when we try to get them to model a part of the world that includes themselves.”
We discuss the limitations of current information processing structures in the next section.
5. Limitations of the Current State of the Art
Symbolic computing, while it has been very successful in driving advances in our digital information processing systems, has reached its limits in realizing further improvements in resiliency, efficiency and scalability. Similarly, the digital neural networks used in Deep Learning, which have many applications in processing massive amounts of voice, text, pictures and video to extract hidden correlations and gain insights, are also reaching their own limits in accuracy, explainability, and the ability to model and predict behaviors using observations from experience.
In this section, we address these shortcomings and discuss how to go beyond computation and its limits, and also beyond Deep Learning and its limits.
5.1. Limitations of Symbolic Computing
There are two new drivers that are testing the boundaries of Church-Turing thesis:
1. Current business services demand non-stop operation, with performance adjusted in real time to meet rapid fluctuations in service demand or available resources. The speed with which the quality of service has to be adjusted to meet the demand is becoming faster than the time it takes to orchestrate the myriad infrastructure components (such as virtual machine (VM) images, network plumbing, application configurations, middleware, etc.) distributed across multiple geographies and owned by different providers. It takes time and effort to reconfigure distributed plumbing, which results in increased cost and complexity. A new architecture must decouple application execution from the infrastructure orchestration across distributed resource providers and enable self-managing applications with or without VMs. Application provisioning, monitoring and reconfiguration must be automated and proactively managed with global knowledge of available resources, their utilization, and the need for adjustment to meet the fluctuations in demand or availability. In this post, we discuss the role of hierarchical cognizing agents in providing sentience, resilience and scalability with global knowledge and local autonomy.
2. Application security has to become self-regulating and tolerant in order to manage “weak” or “no” trust in the participating entities, whether they are other service components, people or devices. The solution requires decoupling service security mechanisms (authentication, authorization and accounting) from the myriad infrastructure and service provider security operations. In addition, as Steve Jobs demonstrated with the blue-box, the service control path has to be separated from the data path to provide infrastructure-independent end-to-end security, which also demands that application network management be decoupled from infrastructure network management, with the application in the driver’s seat. Privacy dictates that data ownership be preserved with its rightful owners, with a mechanism provided to track the data and leave control with those owners. In this post, we argue that the hierarchical cognizing agents in the structural machines provide a means to implement security management.
Church-Turing thesis boundaries are challenged when rapid non-deterministic fluctuations drive the demand for resource readjustment in real-time without interrupting the service transactions in progress. The information processing structures using structural machines we discuss here, provide autonomous and predictive behavior by extending their cognitive apparatuses to include themselves, their resources and their behaviors along with the information processing tasks at hand. Figure 3 shows hierarchical cognizing agents (Oracles) managing the downstream information processing structures based on their blueprints defining the process intent, constraints, available resources, sensors and actuators to manage process evolution along with deviation control from the process intent [12].
Figure 3: Hierarchical network of cognizing agents managing downstream process execution based on the process blueprint
Each cognizing agent has the knowledge for configuring, monitoring, and managing deviations from the intent in the form of a blueprint defining the process evolution using sensors and actuators. An implementation of this architecture demonstrating multi-cloud orchestration of a web service with auto-failover, auto-scaling and live migration of application components without disrupting the service is described in [12]. The authors describe the demonstrated features that differ from the current state of the art (a minimal sketch of the blueprint-driven control loop follows the list below):
- Migrating a workflow executed in a physical server (a web service transaction including a webserver, application server and a database) to another physical server without a reboot or losing transactions while maintaining recovery time and recovery point objectives.
- Providing workflow auto-scaling, auto-failover and live migration using distributed computing clusters with heterogeneous infrastructure (bare-metal servers, private and public clouds, etc.) without infrastructure orchestration to accomplish them (e.g., moving virtual machine images).
- Creating an interoperable private and public cloud network and enabling cloud-agnostic computing with workflow auto-failover, auto-scaling and live migration across clouds without losing transactions.
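Here is a rough, hypothetical Python sketch of the blueprint-driven control loop described above (it is not the DIME implementation reported in [12]; the class names, sensors and thresholds are invented for illustration): a cognizing agent compares sensor readings against the intent declared in a blueprint and invokes actuators such as auto-scaling or auto-failover when a deviation is detected.

```python
# Hypothetical sketch of a blueprint-driven cognizing agent; not the implementation in [12].
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Blueprint:
    """Declares the process intent and the constraints the agent must maintain."""
    intent: str
    max_latency_ms: float
    min_replicas: int


@dataclass
class CognizingAgent:
    """Monitors a managed service through sensors and corrects deviations
    from the blueprint intent through actuators (auto-scaling, failover)."""
    blueprint: Blueprint
    sensors: Dict[str, Callable[[], float]]
    actuators: Dict[str, Callable[[], None]]
    log: List[str] = field(default_factory=list)

    def control_step(self) -> None:
        latency = self.sensors["latency_ms"]()
        replicas = self.sensors["healthy_replicas"]()
        if replicas < self.blueprint.min_replicas:
            self.log.append("deviation: replica loss -> auto-failover")
            self.actuators["failover"]()
        elif latency > self.blueprint.max_latency_ms:
            self.log.append("deviation: latency breach -> auto-scale")
            self.actuators["scale_out"]()
        else:
            self.log.append("within intent: no action")


if __name__ == "__main__":
    bp = Blueprint(intent="serve web transactions", max_latency_ms=200, min_replicas=2)
    agent = CognizingAgent(
        blueprint=bp,
        sensors={"latency_ms": lambda: 250.0, "healthy_replicas": lambda: 2},
        actuators={"scale_out": lambda: print("scaling out"),
                   "failover": lambda: print("failing over")},
    )
    agent.control_step()
    print(agent.log)
```

In a hierarchy, the same loop repeats one level up: a higher-level agent treats each lower-level agent's log and health as its own sensor inputs.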
5.2. Limitations of Digital Neural Networks:
Deep learning has delivered a variety of practical uses in the past decade, from revolutionizing customer experience to machine translation, language recognition, autonomous vehicles, computer vision, text generation, speech understanding, and a multitude of other AI applications.
Deep learning models do not require algorithms that specify what to do with the data. Instead, the extraordinary amount of data we as humans collect and consume is fed to the models. An artificial neural network takes some input data, transforms it by calculating a weighted sum over the inputs, and applies a non-linear function to this transformation to calculate an intermediate state. These steps constitute what is known as a layer, and the transformative function is often referred to as a unit. The intermediate states, often termed features, are used as the input into another layer. Through repetition of these steps, the artificial neural network learns multiple layers of non-linear features, which it then combines in a final layer to create a prediction.
The neural network learns by generating an error signal that measures the difference between the predictions of the network and the desired values and then using this error signal to change the weights (or parameters) so that predictions get more accurate.
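The two paragraphs above can be restated in a few lines of plain Python (no framework; the toy task and learning rate are illustrative): one unit computes a weighted sum passed through a non-linearity, and the training step nudges the weights in the direction that shrinks the error signal.

```python
import math
import random


def layer(inputs, weights, bias):
    """One 'unit': a weighted sum of the inputs followed by a non-linearity (sigmoid)."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # the intermediate state, or "feature"


def train_step(inputs, target, weights, bias, lr=0.5):
    """Error signal = prediction - desired value; the weights move to shrink that error."""
    prediction = layer(inputs, weights, bias)
    error = prediction - target
    grad = error * prediction * (1.0 - prediction)  # derivative through the sigmoid unit
    new_weights = [w - lr * grad * x for w, x in zip(weights, inputs)]
    new_bias = bias - lr * grad
    return new_weights, new_bias


if __name__ == "__main__":
    random.seed(0)
    w, b = [random.uniform(-1, 1) for _ in range(2)], 0.0
    for _ in range(2000):  # learn a simple OR-like mapping from examples
        for x, y in [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]:
            w, b = train_step(x, y, w, b)
    print([round(layer(x, w, b), 2) for x in ([0, 0], [0, 1], [1, 0], [1, 1])])
```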
Therein lies the limitation of Deep Learning. While we gain insights about hidden correlations, extract features and distinguish categories, we lack transparency into the reasoning behind these conclusions. Most importantly, there is the absence of common sense. Deep learning models might be the best at perceiving patterns, yet they cannot comprehend what the patterns mean, and they lack the ability to model their behaviors and reason about them.
True intelligence involves generalizing from observations, creating models, and deriving new insights from the models through reasoning. In addition, human intelligence creates a history and uses past experience in making decisions.
Based on our knowledge of how natural intelligence works, we can surmise that the following key elements of the human mind, which leverage the brain and the body at the cellular level, are missing in the current state-of-the-art AI:
- Time Dependence & History of Events: In Nature, systems are continuously evolving and interacting with each other. Sentient systems (with the capacity to feel, perceive or experience) evolve using a non-Markovian process, where the conditional probability of a future state depends not only on the present state but also on its prior state history. Digital systems, to evolve to be sentient and mimic human intelligence, must include time dependence and history in their process dynamics (a minimal sketch contrasting Markovian and non-Markovian evolution follows this list).
- Knowledge Composition and Transfer Learning: The main outcome of this ability is to understand and consequently predict behaviors by a succession of causal deductions supplementing correlated inductions.
- Exploration vs. Exploitation dilemma: Creativity and expertise are the consequences of our ability to swap from the comfort zone to uncharted territories, and this is a direct and key usage of our transfer-learning skill. Analogies and translations are powerful tools of creativity, taking knowledge from one domain and applying it in another.
- Hierarchical structures: As proved by Gödel, an object can only be described (and managed) by an object of a higher class. This is a key principle of how cells work: they exchange proteins whose numbers, functions, and messages are supervised by DNA at the cell level or at a group (higher) level.
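As a toy illustration of the first element in the list above (purely illustrative Python, not a claim about how a production system would model it), the Markovian policy below looks only at the current state, while the non-Markovian policy also consults the recorded history of prior states before deciding what to do next.

```python
# Toy contrast between Markovian and non-Markovian (history-dependent) evolution.
from typing import List


def markovian_next(current: str) -> str:
    """The next state depends only on the present state."""
    return "alert" if current == "degraded" else "normal"


def non_markovian_next(current: str, history: List[str]) -> str:
    """The next state depends on the present state AND the prior state history:
    a single degraded reading is tolerated, a repeated pattern is escalated."""
    recent_degradations = history[-5:].count("degraded")
    if current == "degraded" and recent_degradations >= 2:
        return "escalate"  # the history reveals a persistent problem
    if current == "degraded":
        return "watch"     # an isolated fluctuation, keep observing
    return "normal"


if __name__ == "__main__":
    history = ["normal", "degraded", "normal", "degraded"]
    print(markovian_next("degraded"))               # -> alert (no memory of the past)
    print(non_markovian_next("degraded", history))  # -> escalate (uses the memory)
```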
In this post, we address how to go beyond the symbolic computation and its limits, and also augment Deep Learning with Deep Reasoning based on Deep Knowledge and Deep Memory.
6. Going Beyond Computation and its Limits
Today’s computing has evolved to execute very complex algorithms by processing multitudes of data streams, which creates new representations, i.e., the knowledge of the system. The algorithms are designed to address the evolution of data. While the “intent” of the algorithm is well defined in terms of a sequence of steps, the resources and the time required for executing the intent depend on many factors outside the specification and scope of the algorithm itself. Computing resources such as speed and memory determine the outcome of the execution. The nature of the algorithm also dictates the resources required. As the demand grows for information and its processing to be available anywhere, anytime, in any form, the Church-Turing thesis assertion that resource limitations can be ignored breaks down. Computations suffer when fluctuations occur in either the availability of or demand for CPU and memory.
Burgin in his book Super-Recursive Algorithms [17] emphasizes that “efficiency of an algorithm depends on two parameters: power of the algorithm and the resources that are used in the process of solution. If the algorithm does not have necessary resources, it cannot solve the problem under consideration.” The computing resources required for the computation depend both on their availability in terms of CPU, memory, network bandwidth, latency, storage capacity, IOPs and throughput characteristics of the hardware, and on the nature of the algorithm (the software). The efficiency of execution of the computation depends upon managing the dynamic relationship of the hardware and the software by monitoring the fluctuations and adjusting the resources needed to execute the computation. The experience of a new software architecture (Distributed Intelligent Managed Element - DIME) that has been developed to use the meta-model of the computation and the knowledge about available resources to monitor and manage the computation workflow is presented in the book Designing a New Class of Distributed Systems and discussed in the Turing Centenary Conference proceedings [7]. The DIME network architecture, presented there with some proof-of-concept implementations, allows managing the evolution of the computation by adjusting the resources required through constant monitoring and control with a cognitive management overlay. Its implementation and its impact on designing and managing non-stop digital systems, even in the face of rapid fluctuations in either the demand for resources or their availability, give rise to a new class of sentient computing where systems adjust their evolution without any interruption to the execution of their mission [12, 13, 14]. The theoretical basis for the DIME network architecture is provided by the theory of Oracles advanced by Professor Mark Burgin [15, 17].
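To suggest how such a cognitive management overlay differs from the computation it supervises, here is a schematic Python sketch (it is not the DIME implementation described in [7] or [12]; the classes, thresholds and memory model are invented): a manager runs alongside a worker computation, watches its resource needs, and grants more memory so the computation never has to stop.

```python
# Schematic only: a computation plus a management overlay that watches its resources.
from dataclasses import dataclass


@dataclass
class ManagedComputation:
    """An ordinary task that needs memory to finish its work."""
    work_items: int
    memory_mb: int

    def step(self) -> bool:
        needed = self.work_items * 2  # pretend each remaining item needs 2 MB
        if needed > self.memory_mb:
            return False              # would stall without intervention
        self.work_items -= 1
        return True


@dataclass
class OverlayManager:
    """The management overlay: monitors the computation's vital signs and
    adjusts its resources so it completes without a reboot or interruption."""
    target: ManagedComputation
    memory_ceiling_mb: int = 1024

    def supervise(self) -> None:
        while self.target.work_items > 0:
            if not self.target.step():
                grant = min(self.memory_ceiling_mb, self.target.work_items * 2)
                print(f"manager: raising memory to {grant} MB")
                self.target.memory_mb = grant  # adjust resources, do not restart
            # otherwise the computation proceeds untouched


if __name__ == "__main__":
    OverlayManager(ManagedComputation(work_items=100, memory_mb=50)).supervise()
```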
In this post we focus on going beyond Deep Learning.
6.1. Going Beyond Deep Learning
In order to go beyond Deep Learning, we take the cue from the neocortex in the human brain. The neocortex plays a crucial role in the control of higher perceptual processes, cognitive functions, and intelligent behavior. It acts as a higher-level information processing mechanism that uses re-composable neural subnetworks to create a hierarchical information processing system with predictive and proactive analytics. In other words, it allows us to learn, adapt and create new modes of behavior.
Ray Kurzweil [16] succinctly summarizes the nature and function of the neocortex. The basic unit of the neocortex is a module of neurons which he estimates at around a hundred. The pattern of connections and synaptic strengths within each module is relatively stable. He emphasizes that it is the connections and the synaptic strengths between the modules that represent learning. The physical connections between these modules are created to represent hierarchical information pattern processing workflows in a repeated and orderly manner. He asserts that the real strength of the neocortex is that the connections are built hierarchically, reflecting the natural hierarchical order of reality. He goes on to say “the basic algorithm of the neocortical pattern recognition module is equivalent across the neocortex from “low-level” modules, which deal with the most basic sensory patterns, to “high level” modules, which recognize the most abstract concepts”. The universal nature of this algorithm provides a stable architecture giving rise to plasticity and predictive management of information processing. The hierarchy is inherently recursive and supports redundancy. The signals go up and down the conceptual hierarchy. A signal going up means “I have detected a pattern” and a signal going down means, “I am expecting your pattern to occur” which is essentially a prediction. “Both upward and downward signals can be excitatory or inhibitory”. These insights into how the biology works are very valuable for us to design solutions that infuse cognition into silicon based information processing systems.
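The up-and-down signalling Kurzweil describes can be caricatured in a few lines of Python (a deliberately simplified sketch; the modules, patterns and thresholds are invented for illustration): low-level modules send "pattern detected" signals upward, and a higher-level module sends expectations downward that act as predictions.

```python
# Caricature of hierarchical pattern modules with upward detections and downward expectations.
class PatternModule:
    def __init__(self, name, pattern):
        self.name = name
        self.pattern = pattern      # the low-level pattern this module recognizes
        self.expectation = 0.0      # raised by downward signals from the level above

    def recognize(self, data):
        """Upward signal: 'I have detected a pattern' (boosted if it was expected)."""
        score = 1.0 if self.pattern in data else 0.0
        return min(1.0, score + self.expectation)

    def expect(self, strength):
        """Downward signal: 'I am expecting your pattern to occur' (a prediction)."""
        self.expectation = strength


class HigherModule:
    """Recognizes an abstract concept ('greeting') from lower-level detections and
    pushes expectations back down to the modules that usually co-occur with it."""

    def __init__(self, children):
        self.children = children

    def process(self, data):
        detections = {c.name: c.recognize(data) for c in self.children}
        if detections.get("hello", 0.0) > 0.5:
            self.children[1].expect(0.5)  # predict that a name will follow
        return detections


if __name__ == "__main__":
    low = [PatternModule("hello", "hello"), PatternModule("name", "alice")]
    high = HigherModule(low)
    print(high.process("hello there"))  # 'hello' detected -> expectation sent down
    print(high.process("alic"))         # partial input, boosted by the prediction
```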
The long and short of these observations is that there are hierarchical computing structures that go beyond neural networks: they provide models of the observations, abstractions and generalizations from experience, and they use time and history to provide reasoning and predictive behaviors. The models, comprising deep knowledge, are designed to capture not only the classification of objects, their attributes and relationships, but also the behaviors associated with them. These behaviors are captured as generalizations from history and observations. At any point in time, a new event triggers an evolution of the current state to a future state based not only on the current state but also on its past history.
The non-Markovian behavior gives rise to a new level of intelligence that goes beyond what mere computing, communication and cognition alone support. In order to model this level of intelligence, we propose a super-recursive neural network, an ontology-based model of the domain of interest created from various pieces of knowledge (observations, experience, science, common sense, etc.), and a memory that captures the time and history of the various instances populating the model. Figure 4 shows our proposal for the path to strong AI, whose goal is to develop artificial intelligence to the point where the machine's intellectual capability is functionally equal to a human's.
The proposal attempts to fill the current gaps with:
- Modeling and generalizations based on domain ontologies, science, mathematics, common sense observations, transfer learning from different domains and creativity/expertise from experts.
- The model consists of knowledge structures that contain entities, relationships and behaviors with potential consequences.
- Various instances of the model are populated using Deep Learning and other mechanisms to create Deep Knowledge.
- The model and the instances are made executable in a sentient edge cloud using the DIME network architecture mentioned above.
- The history and evolution of the instances are used to create Deep Memory with crypto-security.
- The sentient model mimicking the physical world in the digital world is used to create a Deep Reasoning Module that uses history and simulation to gain insights and take action.
Figure 4: Augmenting Deep Learning with Deep Knowledge, Deep Memory and Deep Reasoning
7. The Theory Behind
As noted in the introduction, mathematics has once again shown up in an unexpected connection. Here we describe the new mathematics of named sets, knowledge structures, the theory of Oracles and structural machines, and how they allow us to advance digital information processing structures to become sentient, resilient and intelligent.
7.1. Theory of Oracles and Sentient Computing Structures
According to Mark Burgin, structural relationships exist between data, which are entities observed in the physical world or conceived in the mental world. These structures define the knowledge about them in terms of their properties such as attributes, relationships and the behaviors of their interaction. Information processing structures deal with the evolution of these knowledge structures using an overlay of cognitive knowledge structures that model, monitor and manage the evolution of the system as a whole.
The most fundamental entity is called a fundamental triad or a named set. It has the structure shown in Figure 5.
Figure 5: The fundamental triad defining the named set.
An entity or object with its own name is connected to another entity or object with its own name. The connection, which itself has a name, depicts the knowledge about the relationship and/or the behavioral evolution when a state change occurs in either object.
A knowledge structure is composed of related fundamental triads, and any state change causes behavioral evolution based on the connections. The long and short of the theory of knowledge is that objects, their attributes in the form of data, and the intrinsic and ascribed knowledge of these objects in the form of algorithms and processes make up the foundational blocks for information processing. Information processing structures utilize knowledge in the form of algorithms and processes that transform one state (determined by a set of data) of the object to another with a specific intent. Information structures and their evolution using knowledge and data determine the flow of information. Living organisms have found a way not only to define the knowledge about physical objects but also to create information processing structures that assist them in executing state changes.
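A fundamental triad is simple enough to write down directly. The sketch below (illustrative Python; the class names are mine, not Burgin's notation) represents a named set as a (source entity, named connection, target entity) triple and a knowledge structure as a collection of such triads that can be queried when a state change occurs.

```python
# Illustrative encoding of fundamental triads (named sets) and a knowledge structure.
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FundamentalTriad:
    """A named set: a named entity connected to another named entity by a named
    connection that carries the relationship or behavioral knowledge."""
    source: str
    connection: str
    target: str


class KnowledgeStructure:
    """A composition of related triads; a state change in one entity is propagated
    along its connections to the entities whose behavior depends on it."""

    def __init__(self, triads: List[FundamentalTriad]):
        self.triads = triads

    def affected_by(self, entity: str) -> List[FundamentalTriad]:
        return [t for t in self.triads if t.source == entity]


if __name__ == "__main__":
    ks = KnowledgeStructure([
        FundamentalTriad("temperature_sensor", "raises_alarm_in", "cooling_controller"),
        FundamentalTriad("cooling_controller", "regulates", "server_room"),
    ])
    for triad in ks.affected_by("temperature_sensor"):
        print(f"{triad.source} --{triad.connection}--> {triad.target}")
```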
The structural machine framework describes a process which allows information processing through the transformation of knowledge structures. It involves a control device that, using a processor, configures and executes information processing operations on knowledge structures and manages those operations throughout their life-cycle. The processor uses the knowledge structures as input and delivers the processed information as knowledge structures in the output space.
In the special case where the input knowledge structures and the output knowledge structures are words (symbols) and the process to be executed is an algorithm (a sequence of operations), the structural machine becomes a Turing machine. The control is outside the Turing machine; it provides the algorithm to execute, assures the processor has the right resources to perform the operations, and judges whether the computation is performed as expected. In essence, the functional requirements of the system under consideration, such as business logic and sensor and actuator monitoring and control (the computed), are specified as algorithms and are executed by the processor transforming the knowledge structures from the input space to the output space. Figure 6 shows the structural machine framework.
Figure 6: Structural Machine Framework for Information Processing Structures
It is important to observe that when the controller is an operator, the processor is a stored-program implementation of a Turing machine, the processing space is the memory, and the knowledge structures are symbolic data structures, we obtain the current state-of-the-art information processing structure.
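One minimal way to see this reduction (schematic Python, not a formal definition of Burgin's structural machine): when the knowledge structures handed to the processor are just symbols and the operation is an ordinary algorithm, the "structural machine" below behaves like a conventional computation, with the controller merely supplying the program and judging the output.

```python
# Schematic structural machine: a controller plus a processor acting on knowledge structures.
from typing import Any, Callable


class StructuralMachine:
    """The controller configures the processor with an operation and manages its
    life-cycle; the processor transforms input knowledge structures into output
    knowledge structures in the processing space."""

    def __init__(self, operation: Callable[[Any], Any], check: Callable[[Any], bool]):
        self.operation = operation  # supplied by the controller
        self.check = check          # the controller's judgement of the result

    def run(self, knowledge_structure: Any) -> Any:
        output = self.operation(knowledge_structure)  # the processor at work
        if not self.check(output):
            raise RuntimeError("controller: computation did not meet its intent")
        return output


if __name__ == "__main__":
    # Special case: the knowledge structures are symbols (a string) and the
    # operation is an algorithm, so this behaves like a conventional computation.
    reverse = StructuralMachine(operation=lambda s: s[::-1],
                                check=lambda out: isinstance(out, str))
    print(reverse.run("structural"))  # -> 'larutcurts'
```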
The knowledge structure depicted in Figure 7 is shown as a graph capturing domain knowledge. The picture captures hierarchical, decoupled and concurrently evolving structures whose evolutionary behavior is captured in the fundamental triads or named sets. The knowledge structures go beyond the knowledge captured by taxonomies or ontologies, and both can be used to create knowledge structures by adding the connections depicting the relationships and evolutionary behaviors.
Figure 7: A network of knowledge structures showing the state vectors
According to Greek mythology, "Oracles" answered questions presented to them using knowledge not available to other people. Turing spent 1936–1938 at Princeton writing a Ph.D. thesis under Church on ordinal logics. A tiny and obscure part of his paper (Turing 1939) included a description of an Oracle machine (o-machine), roughly a Turing a-machine which could interrogate an "Oracle" (external database) during the computation. The one-page description was very sketchy and Turing never developed it further. Soare (Soare 2009) discussed relative computability using the theory of Oracles. More recently, Burgin has introduced a general theory of Oracles; Burgin, Eberbach and Mikkilineni [15] discuss a generalized theory of Oracles that can perform many functions.
7.2. Non-Markovian Information Processing, Crypto-Security and Deep Memory
The knowledge structures and their history are kept in local memory (storage) with crypto-security. When an event occurs that causes a potential change to the knowledge structure based on various relationships and behaviors, the evolution is carried out by a deep reasoning machine that uses cognizing agents to examine potential consequences and provide insights based on deep knowledge. Today, most of the heuristics used to understand and predict behaviors are Markovian: the probability of each event depends only on the state attained in the previous event. We are using non-Markovian reasoning models because that is the only way to understand and predict complex, real-time behavior changes of thousands or millions of human beings or machines.
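One plausible reading of keeping the history with crypto-security is an append-only, hash-chained event log (a minimal sketch using Python's standard hashlib; this is my assumption about the mechanism, not a description of any specific product): each record commits to its predecessor, so tampering with the stored history is detectable, and a reasoning step can replay the whole chain rather than only the latest state.

```python
# Minimal sketch: hash-chained event history ("deep memory") that reasoning can replay.
import hashlib
import json
from typing import Dict, List


def _digest(record: Dict) -> str:
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()


class DeepMemory:
    """Append-only history where each entry commits to its predecessor's hash,
    so any later modification of past events breaks the chain."""

    def __init__(self):
        self.chain: List[Dict] = []

    def append(self, event: Dict) -> None:
        prev_hash = self.chain[-1]["hash"] if self.chain else "genesis"
        record = {"event": event, "prev": prev_hash}
        record["hash"] = _digest(record)
        self.chain.append(record)

    def verify(self) -> bool:
        prev = "genesis"
        for record in self.chain:
            body = {"event": record["event"], "prev": record["prev"]}
            if record["prev"] != prev or record["hash"] != _digest(body):
                return False
            prev = record["hash"]
        return True


if __name__ == "__main__":
    memory = DeepMemory()
    memory.append({"state": "normal"})
    memory.append({"state": "degraded"})
    print(memory.verify())                          # True: the history is intact
    memory.chain[0]["event"]["state"] = "altered"   # tamper with the recorded past
    print(memory.verify())                          # False: the tampering is detected
```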
8. Computational Theory of Mind and Deep Reasoning:
According to the classical computational theory of mind (CCTM), the mind is a computational system similar in important respects to a Turing machine, and core mental processes (e.g., reasoning, decision-making, and problem solving) are computations similar in important respects to computations executed by a Turing machine. However, the classical theory falls short in addressing the cognitive overlay that models both the genetic behaviors and the influence of the neural networks. The mind incorporates the model of the body and the cognitive knowledge through the brain, and it creates a history of events and behaviors, along with the underlying correlations, in the form of deep memory.
A more recent Real-time Control System (RCS) [18] reference model architecture for intelligent systems has been described and mapped onto the physical structure of the brain. Both the RCS architecture and the brain are hierarchical, with layers of interconnected computational modules that generate the functionality of sensory processing, world modeling, value judgment, and behavior generation. At the lower layers, these processes generate goal-seeking reactive behavior. At higher layers, they enable perception, cognition, reasoning, imagination, and long-term planning. Within each hierarchical layer, the range and resolution in time and space is limited. At low layers, range is short and resolution is high, whereas at high layers, range is long and resolution is low. This enables high precision and quick response to be achieved at low layers over short intervals of time and space, while long-range plans and abstract concepts can be formulated at high layers over broad regions of space and time.
This post proposes hierarchical cognizing agents based on the theory of Oracles to implement digital-world representations of real-world physical and mental models, using knowledge structures that describe the evolution of domain-specific entities, relationships, events and behaviors. Figure 8 shows this representation captured in hierarchical knowledge structures implemented by digital neurons and digital genes. The functions implemented by individual neurons and genes are composed into information processing structural machines managed by cognizing agents.
Figure 8: Mind, body and brain metaphor for information processing in the digital world.
I conclude this post with an excerpt from [19].
While answering the question "Chicken or the egg, which came first?" it is said [10] that the chicken is an egg's way of making another egg (or we can say replicating itself). The genes in the egg are programmed to replicate themselves using the resources available effectively. They already come with an "intent", the workflows to execute the intent and monitoring and controlling best practices to adjust the course if deviations occur, whether they be from fluctuations in resources or the impact of its interaction with the environment. The intent of the genes, it seems, is the ability to survive and replicate. There is a symbiosis of the genes (which contain the information about the intent, workflows and also process knowledge to execute the intent) and the hardware in the form of chemicals such as amino acids and proteins that provide the means.
A similar symbiosis exists between the software and hardware involved in a computation. It can be equally said that the hardware is software's means of sustaining itself to deliver its intent. The hardware provides the metabolism (here we use the word metabolism loosely to denote the resources required to execute the algorithm following Dyson [10]) to maintain the vital signs of the software (in the form of CPU cycles, memory, network bandwidth, latency, storage capacity, IOPs and throughput). The software contains the algorithms to deliver the intent. In the current state of the art of computations, the software is not aware of its vital signs and the hardware is not aware of the software's intent. They are brought together by external agents who know the intent (functional requirements) and also know the hardware capabilities to match them, monitor the vital signs (nonfunctional requirements) and adjust the circumstance to meet changing demands. When the hardware resources are distributed and shared by multiple software algorithms with different intents and different “metabolic” rates, the agents that mediate grow exponentially increasing both complexity and management fatigue. The problems are compounded when both the scale and fluctuations in the system increase. In order to understand the reason behind this complexity and resulting inefficiency, we have to go back to the software origins and the limitations of the Turing machine model of information processing as we discussed here.
George Dyson [10] speculates on the decoupling of hardware infrastructure (providing “metabolism”) and its management from the software executing specific processes with an intent and its replication capability: "Metabolism and replication, however intricately they may be linked in the biological world as it now exists, are logically separable. It is logically possible to postulate organisms that are composed of pure hardware and capable of metabolism but incapable of replication. It is also possible to postulate organisms that are composed of pure software and capable of replication but incapable of metabolism". They use available resources anywhere to configure, monitor and manage themselves and their interactions with each other and their environment.
Post Script
George Gilder’s “Life After Google” points out that the current premise of “free computing in exchange for your soul in the form of privacy to the big companies running big AI programs without consciousness” is not sustainable in the long run. He goes back to the origins of computing machines and points out the flaw in our parents’ computer science (Turing and von Neumann themselves pointed this out in 1948, referring to Gödel’s proof of incompleteness and inconsistency in mathematical logic) that limits the computer's ability to surpass human consciousness. The new mathematics points to a solution to these limitations as discussed in this video.
Interesting food for thought.
References
[1] Wigner, E.: "The Unreasonable Effectiveness of Mathematics in the Natural Sciences," Communications on Pure and Applied Mathematics, Vol. 13, No. 1 (February 1960). New York: John Wiley & Sons, Inc. (1960)
[2] R. Mikkilineni, (2012). "Going beyond computation and its limits: Injecting cognition into computing." Applied Mathematics 3, pp. 1826-1835.
[3] H. R. Maturana, “Biological Computer Laboratory Re- search Report BCL 9.0,” University of Illinois, Urbana, 1970.
[4] H. R. Maturana and F. J. Varela, "Autopoiesis and Cognition: The Realization of the Living (Boston Studies in the Philosophy of Science)," D. Reidel, Dordrecht, 1980.
[5] Mikkilineni, R.; Morana, G.; , "Injecting the Architectural Resiliency into Distributed Autonomic Systems Using DIME Network Architecture," Complex, Intelligent and Software Intensive Systems (CISIS), 2012 Sixth International Conference on , vol., no., pp.867-872, 4-6 July 2012
[6] G. Dodig-Crnkovic and R. Giovagnoli (Eds.), Computing Nature: Turing Centenary Perspective. Springer, Heidelberg, 2013.
[7] Mikkilineni, R., Comparini, A. and Morana, G. (2012a) ‘The Turing o-machine and the DIME Network Architecture: Injecting the Architectural Resiliency into Distributed Computing’, Turing100, The Alan Turing Centenary, EasyChair Proceedings in Computing. Available online at: www.easychair.org/publications/?page=877986046 (accessed on 03 March 2020).
[8] Gödel, K. (1931). Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I. Monatshefte für Mathematik und Physik, 38, 173-198.
[9] Turing, A. M. (2004). In B. J. Copeland (Ed.), The Essential Turing. Oxford, UK: Oxford University Press.
[10] Dyson, G. B. (1997). Darwin among the Machines, the evolution of global intelligence. Reading MA: Addison Wesley.
[11] Cockshott, P., MacKenzie, L.M. and Michaelson, G. (2012) Computation and Its Limits, Oxford University Press, Oxford.
[12] Burgin, M. and Mikkilineni, R. (2018) Cloud computing based on agent technology, super-recursive algorithms, and DNA, Int. J. Grid and Utility Computing, Vol. 9, No. 2, pp. 193–204.
[13] Mikkilineni, R., Morana, G. and Zito, D. (2015) ‘Cognitive application area networks: a new paradigm for distributed computing and intelligent service orchestration’, Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE), 2015 IEEE 24th International Conference on, Larnaca, pp.51–56
[14] Mikkilineni, R. and Morana, G. (2016) ‘Cognitive distributed computing: a new approach to distributed datacenters with self-managing services on commodity hardware’, International Journal of Grid and Utility Computing, Vol. 7, No. 2.
[15] Burgin, M., Eberbach, E., and Mikkilineni, R., (2019). "Cloud Computing and Cloud Automata as A New Paradigm for Computation" Computer Reviews Journal Vol 3 ISSN: 2581-6640. https://purkh.com/index.php/tocomp
[16] Kurzweil, R. (2012). How to create a mind: The secret of human thought revealed. New York: Viking.
[17] Burgin, M. (2005) “Super-Recursive Algorithms.” Springer, New York.
[18] Albus, J. S. (2008) Toward a Computational Theory of Mind, Journal of Mind Theory Vol. 0 No. 1.
[19] Mikkilineni, R., Morana, G. and Burgin, M. Oracles in Software Networks: A New Scientific and Technological Approach to Designing Self-Managing Distributed Computing Processes, Proceedings of the 2015 European Conference on Software Architecture Workshops, Dubrovnik/Cavtat, Croatia, September 7-11, 2015, ACM, 2015, pp. 11:1-11:8