The Learning Problem ~ Associated Components

This article investigates the abstract structure of the problem that learning algorithms solve within the context of neural networks. It also investigates how gradient descent, differential equations, and set theory influence the behavior and performance of learning algorithms.

First and foremost, it is essential to acknowledge the difference between an analytical solution, which is obtained through a direct mathematical derivation, and an empirical solution, which is derived from observed data. This distinction is crucial to the framework of learning that underlies neural network topologies. In short, neural network topologies are not deployed to produce analytical solutions; analytical solutions belong to settings where an individual has a firm understanding of the mechanics and physics that govern the behavior of an object within the environment in which it participates. Implementing neural networks, on the other hand, is most advantageous in the realm of empirical solutions. Empirical solutions are akin to evidence-based decision-making: there is no claim to understand the underlying governing laws that produce the solution (e.g., the mechanics and physics). However, by observing prior history (i.e., historical data inputs mapped to outcomes), an individual may deduce the corresponding outcome of a given input from the pattern that the historical data reveals. This point is important because it is precisely this ambiguity in how inputs map to corresponding outputs, the ambiguity of how a neural network topology fashions an output from a given input, that earns neural networks the label "black box" heard so frequently within the deep learning, machine learning, and data science communities.
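As a toy illustration of this distinction (the falling-object scenario, functions, and numbers below are my own invented example, not taken from the discussion above), an analytical solution computes its output directly from a known governing law, while an empirical solution infers it only from recorded observations:

```python
# Analytical solution: the governing law (free fall) is known, so the output
# is computed directly from first principles.
def drop_distance_analytical(time_s, g=9.81):
    return 0.5 * g * time_s ** 2

# Empirical solution: the governing law is treated as unknown; we only have
# historical (input, outcome) observations and interpolate between them.
observed = [(0.0, 0.0), (1.0, 4.9), (2.0, 19.6), (3.0, 44.1)]  # (seconds, meters)

def drop_distance_empirical(time_s):
    # Linear interpolation between the two nearest recorded observations.
    for (t0, d0), (t1, d1) in zip(observed, observed[1:]):
        if t0 <= time_s <= t1:
            frac = (time_s - t0) / (t1 - t0)
            return d0 + frac * (d1 - d0)
    raise ValueError("time outside the range of observed history")

print(drop_distance_analytical(1.5))  # exact, from physics: 11.03625
print(drop_distance_empirical(1.5))   # estimated from data alone: 12.25
```

The empirical estimate is only as good as the pattern in the recorded history, which is exactly the position a neural network is in.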

However, what underlying force maps inputs to outputs in a neural network? This is an interesting question with an equally interesting answer, and it comes back to the distinguishing capacity of neural networks to solve problems empirically. Although an analytical solution implies a definite function that generates output from input, knowing such a function explicitly is impossible within the context of artificial learning because of the nature of the problems that neural networks solve. Put another way, the problems solved by neural network topologies are classified as Wicked Problems: complex, dynamic, constantly evolving problems that change 'under your feet.' Solutions to Wicked Problems therefore lose their robustness unless they are tuned to match the problem they are intended to solve, that is, unless they evolve in tandem with the problem.

So, what is the mathematical representation of the 'unknown mapping function' of a Wicked Problem, the 'solution' that a neural network targets? It is a dynamic combination of applied differential equations, convex optimization (i.e., gradient descent), and set theory. The 'unknown mapping function' is never static; it is a moving target. The learning algorithm, the counterpart of the corresponding neural network topology, maintains a set of candidate solutions from which it can select, based on its current impression of the 'unknown target function' at a given moment in time, t. At any given moment, the learning algorithm holds a set of solutions that approximate the 'unknown target function' at that time.
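One compact way to write this setup down is the standard learning-theory formulation; the symbols below (f, D, H, g) are chosen here for illustration and do not appear in the article itself:

```latex
% Unknown target function mapping inputs to outputs:
f : \mathcal{X} \to \mathcal{Y}
% Observed history (the empirical evidence), with outcomes produced by the unknown target:
\mathcal{D} = \{(x_1, y_1), \dots, (x_N, y_N)\}, \qquad y_n \approx f(x_n)
% At time t the learning algorithm holds a set of candidate hypotheses:
\mathcal{H}_t = \{h_1, h_2, \dots\}
% and selects the one that best approximates the moving target on the data seen so far:
g_t \;=\; \arg\min_{h \in \mathcal{H}_t} \; \frac{1}{N} \sum_{n=1}^{N} \ell\big(h(x_n),\, y_n\big),
\qquad g_t \approx f \ \text{at time } t.
```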

Within this approximating function set, the selection is ranked under the guidance of gradient descent. The set approximating the 'unknown target function' is a set whose discrete elements are partial differential equations. The terms within each discrete element of the set change with time, not so much the terms (factors) themselves as the scalar contribution (i.e., the 'weight') of each factor to the overall output of that element.
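Concretely, each element of the approximating set can be pictured as a fixed collection of terms whose time-dependent scalar weights carry the learning. This is a notational sketch of my own; the symbols w_i(t) and \varphi_i are chosen for illustration:

```latex
% A candidate element of the approximation set at time t: the terms (factors)
% \varphi_i stay fixed, while their scalar contributions w_i(t) -- the weights --
% change as learning proceeds. The partial derivatives with respect to the
% weights are what gradient descent uses to rank and adjust the element.
h_t(x) \;=\; \sum_{i=1}^{k} w_i(t)\, \varphi_i(x),
\qquad \frac{\partial h_t}{\partial w_i} = \varphi_i(x).
```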

Gradient descent then selects (i.e., ranks) the best candidate by "averaging" the precision, accuracy, and time optimization (i.e., convex optimization) of each discrete element within the function-approximation set at the moment t.
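As a minimal, concrete sketch of that selection process (the basis terms, data, and learning rate below are invented for illustration and are not taken from the article), gradient descent repeatedly nudges the weights of a candidate element so that its outputs move closer to the observed outcomes:

```python
import numpy as np

# Hypothetical setup: a fixed set of basis terms phi_i(x) whose weighted sum
# approximates the unknown target function at the current moment t.
def basis(x):
    # Simple polynomial basis: [1, x, x^2]
    return np.stack([np.ones_like(x), x, x ** 2], axis=1)

def predict(weights, x):
    # Candidate element: h_t(x) = sum_i w_i(t) * phi_i(x)
    return basis(x) @ weights

def mse_loss(weights, x, y):
    # Mean squared error between predictions and observed outcomes
    return np.mean((predict(weights, x) - y) ** 2)

def gradient(weights, x, y):
    # Gradient of the mean squared error with respect to the weights
    phi = basis(x)
    return 2.0 * phi.T @ (predict(weights, x) - y) / len(x)

# Observed (input, outcome) history standing in for the empirical evidence.
rng = np.random.default_rng(0)
x_obs = rng.uniform(-1, 1, size=50)
y_obs = 3.0 * x_obs ** 2 - 0.5 * x_obs + rng.normal(scale=0.1, size=50)

weights = np.zeros(3)      # initial scalar contributions (the "weights")
learning_rate = 0.1

for step in range(500):
    weights -= learning_rate * gradient(weights, x_obs, y_obs)

print("learned weights:", weights)
print("final loss:", mse_loss(weights, x_obs, y_obs))
```

If the underlying pattern later drifts (the Wicked Problem changing 'under your feet'), the same update loop must keep running on fresh observations for the approximation to stay useful.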
