WHAT IS DEEP LEARNING

Deep learning is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised.[2]

Deep-learning architectures such as deep neural networks, deep belief networks, deep reinforcement learning, recurrent neural networks, convolutional neural networks and transformers have been applied to fields including computer vision, speech recognition, natural language processing, machine translation, bioinformatics, drug design, medical image analysis, climate science, material inspection and board game programs, where they have produced results comparable to and in some cases surpassing human expert performance.[3][4][5]

Artificial neural networks (ANNs) were inspired by information processing and distributed communication nodes in biological systems. ANNs have various differences from biological brains. Specifically, artificial neural networks tend to be static and symbolic, while the biological brain of most living organisms is dynamic (plastic) and analog.[6][7]

The adjective "deep" in deep learning refers to the use of multiple layers in the network. Early work showed that a linear?perceptron ?cannot be a universal classifier, but that a network with a nonpolynomial activation function with one hidden layer of unbounded width can. Deep learning is a modern variation that is concerned with an unbounded number of layers of bounded size, which permits practical application and optimized implementation, while retaining theoretical universality under mild conditions. In deep learning the layers are also permitted to be heterogeneous and to deviate widely from biologically informed?connectionist ?models, for the sake of efficiency, trainability and understandability.

Definition

Deep learning is a class of machine learning algorithms that[8]: 199–200 uses multiple layers to progressively extract higher-level features from the raw input. For example, in image processing, lower layers may identify edges, while higher layers may identify the concepts relevant to a human such as digits or letters or faces.
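
To make this layered extraction concrete, the following is a minimal sketch of a multi-layer feedforward pass in plain NumPy. The layer widths, random weights and `relu` helper are illustrative assumptions, not taken from any particular system; a real model would be trained rather than randomly initialized.

```python
import numpy as np

def relu(x):
    # Elementwise rectified linear unit: max(0, x)
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)

# Hypothetical layer widths: 784 raw inputs (e.g., 28x28 pixels)
# pass through progressively smaller, more abstract representations.
sizes = [784, 256, 64, 10]
weights = [rng.standard_normal((m, n)) * 0.01 for m, n in zip(sizes, sizes[1:])]
biases = [np.zeros(n) for n in sizes[1:]]

def forward(x):
    # Each layer transforms its input into a slightly higher-level feature vector.
    for W, b in zip(weights, biases):
        x = relu(x @ W + b)
    return x

features = forward(rng.standard_normal(784))
print(features.shape)  # (10,) -- the highest-level representation
```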

Viewed from another angle, deep learning refers to computer-simulating or automating human learning processes from a source (e.g., an image of dogs) to a learned object (dogs). Hence, notions such as "deeper" learning or "deepest" learning[9] make sense. The deepest learning refers to fully automatic learning from a source to a final learned object. A deeper learning thus refers to a mixed learning process: a human learning process from a source to a learned semi-object, followed by a computer learning process from the human-learned semi-object to a final learned object.

Overview

Most modern deep learning models are based on multi-layered artificial neural networks such as convolutional neural networks and transformers, although they can also include propositional formulas or latent variables organized layer-wise in deep generative models such as the nodes in deep belief networks and deep Boltzmann machines.[10]

In deep learning, each level learns to transform its input data into a slightly more abstract and composite representation. In an image recognition application, the raw input may be a matrix of pixels; the first representational layer may abstract the pixels and encode edges; the second layer may compose and encode arrangements of edges; the third layer may encode a nose and eyes; and the fourth layer may recognize that the image contains a face. Importantly, a deep learning process can learn which features to optimally place in which level on its own. This does not eliminate the need for hand-tuning; for example, varying numbers of layers and layer sizes can provide different degrees of abstraction.[11][12]
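
To make the edge-encoding first layer concrete, here is a hedged sketch: a hand-written 2D filter pass applying a Sobel kernel, the kind of edge detector that the first layer of a trained image network often converges to. The toy image and kernel are illustrative, not taken from the article.

```python
import numpy as np

def conv2d(image, kernel):
    # Valid-mode 2D cross-correlation (what deep-learning libraries
    # conventionally call "convolution"), written out explicitly.
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return out

# Sobel kernel: responds strongly to vertical edges, a typical
# low-level feature encoded by the first representational layer.
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

image = np.zeros((8, 8))
image[:, 4:] = 1.0           # a vertical light/dark boundary
edges = conv2d(image, sobel_x)
print(edges)                 # large responses along the boundary columns
```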

The word "deep" in "deep learning" refers to the number of layers through which the data is transformed. More precisely, deep learning systems have a substantial?credit assignment path?(CAP) depth. The CAP is the chain of transformations from input to output. CAPs describe potentially causal connections between input and output. For a?feedforward neural network , the depth of the CAPs is that of the network and is the number of hidden layers plus one (as the output layer is also parameterized). For?recurrent neural networks , in which a signal may propagate through a layer more than once, the CAP depth is potentially unlimited.[13] ?No universally agreed-upon threshold of depth divides shallow learning from deep learning, but most researchers agree that deep learning involves CAP depth higher than 2. CAP of depth 2 has been shown to be a universal approximator in the sense that it can emulate any function.[14] ?Beyond that, more layers do not add to the function approximator ability of the network. Deep models (CAP > 2) are able to extract better features than shallow models and hence, extra layers help in learning the features effectively.

Deep learning architectures can be constructed with a greedy layer-by-layer method.[15] Deep learning helps to disentangle these abstractions and pick out which features improve performance.[11]
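
One common instance of this greedy method is layer-wise autoencoder pretraining. The following is a hedged sketch with illustrative hyperparameters: each tied-weight linear autoencoder layer is fitted to the output of the previous layer, which is then frozen.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_autoencoder_layer(X, hidden, epochs=100, lr=0.1):
    """Fit one tied-weight linear autoencoder layer by gradient descent;
    return the encoder weights and the encoded (more abstract) data."""
    W = rng.standard_normal((X.shape[1], hidden)) * 0.05
    for _ in range(epochs):
        err = X @ W @ W.T - X                     # reconstruction error
        grad = 2 * (X.T @ err @ W + err.T @ X @ W)
        W -= lr * grad / len(X)
    return W, X @ W

# Greedy layer-by-layer construction: each layer is trained on the
# output of the previous one, and earlier layers are frozen as we go.
X = rng.standard_normal((200, 32))                # toy unlabeled data
layers = []
for hidden in (16, 8, 4):                         # progressively narrower
    W, X = train_autoencoder_layer(X, hidden)
    layers.append(W)

print([W.shape for W in layers])                  # [(32, 16), (16, 8), (8, 4)]
```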

For supervised learning tasks, deep learning methods eliminate feature engineering by translating the data into compact intermediate representations akin to principal components, and derive layered structures that remove redundancy in representation.

Deep learning algorithms can be applied to unsupervised learning tasks. This is an important benefit because unlabeled data are more abundant than labeled data. Examples of deep structures that can be trained in an unsupervised manner are deep belief networks.[11][16]
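
Deep belief networks are typically built from restricted Boltzmann machines (RBMs) trained one layer at a time on unlabeled data. Below is a hedged sketch of a single contrastive-divergence (CD-1) weight update for a binary RBM; the learning rate and shapes are illustrative, and bias terms are omitted for brevity.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, lr=0.01):
    """One contrastive-divergence (CD-1) step for a binary RBM.
    v0: batch of visible vectors; W: visible-by-hidden weight matrix."""
    # Upward pass: sample hidden units given the data.
    ph0 = sigmoid(v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # One step of Gibbs sampling: reconstruct visibles, re-infer hiddens.
    pv1 = sigmoid(h0 @ W.T)
    ph1 = sigmoid(pv1 @ W)
    # Move toward data statistics, away from model statistics.
    return W + lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)

W = rng.standard_normal((16, 8)) * 0.01
batch = (rng.random((32, 16)) < 0.5).astype(float)  # toy binary data
W = cd1_update(batch, W)
```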

Interpretations

Deep neural networks are generally interpreted in terms of the universal approximation theorem[17][18][19][20][21] or probabilistic inference.[22][8][11][13][23]

The classic universal approximation theorem concerns the capacity of feedforward neural networks with a single hidden layer of finite size to approximate continuous functions.[17][18][19][20] In 1989, the first proof was published by George Cybenko for sigmoid activation functions[17] and was generalised to feed-forward multi-layer architectures in 1991 by Kurt Hornik.[18] Recent work has shown that universal approximation also holds for unbounded activation functions such as Kunihiko Fukushima's rectified linear unit.[24][25]
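
In symbols, the single-hidden-layer statement takes the following standard form (a rendering from the literature, with sigma a sigmoidal, or more generally nonpolynomial, activation function):

```latex
% Classic universal approximation (single hidden layer): for every
% continuous f on the unit cube and every eps > 0 there exist N,
% output weights w_i, vectors a_i and biases b_i such that
\[
  \sup_{x \in [0,1]^n} \left| f(x) - \sum_{i=1}^{N} w_i \,
    \sigma\!\left( a_i^{\top} x + b_i \right) \right| < \varepsilon .
\]
```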

The universal approximation theorem for deep neural networks concerns the capacity of networks with bounded width whose depth is allowed to grow. Lu et al.[21] proved that if the width of a deep neural network with ReLU activation is strictly larger than the input dimension, then the network can approximate any Lebesgue integrable function; if the width is smaller than or equal to the input dimension, then a deep neural network is not a universal approximator.
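
Stated in symbols, following the summary above:

```latex
% Width-bounded case (Lu et al., as summarized above): for a ReLU
% network g of width w > n (the input dimension), any Lebesgue-
% integrable f : R^n -> R can be approximated in the L^1 sense,
\[
  \int_{\mathbb{R}^n} \left| f(x) - g(x) \right| \, dx < \varepsilon ,
\]
% while no such guarantee holds when the width satisfies w <= n.
```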

The probabilistic interpretation[23] derives from the field of machine learning. It features inference,[8][10][11][13][16][23] as well as the optimization concepts of training and testing, related to fitting and generalization, respectively. More specifically, the probabilistic interpretation considers the activation nonlinearity as a cumulative distribution function.[23] The probabilistic interpretation led to the introduction of dropout as a regularizer in neural networks. The probabilistic interpretation was introduced by researchers including Hopfield, Widrow and Narendra and popularized in surveys such as the one by Bishop.[26]
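
As a concrete touchpoint for the activation-as-CDF view, the logistic sigmoid is itself the cumulative distribution function of the logistic distribution. Dropout, which this interpretation motivated, is also short enough to sketch; the rate and "inverted" scaling convention below are the common ones, not taken from the article.

```python
import numpy as np

rng = np.random.default_rng(2)

def dropout(activations, rate=0.5, training=True):
    """Inverted dropout: randomly zero units during training and rescale
    the survivors so the expected activation matches test-time behaviour."""
    if not training:
        return activations            # no-op at evaluation time
    keep = (rng.random(activations.shape) >= rate).astype(float)
    return activations * keep / (1.0 - rate)

h = np.ones((4, 8))
print(dropout(h))    # roughly half the units zeroed, the rest scaled to 2.0
```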

History

There are two types of neural networks: feedforward neural networks (FNNs) and recurrent neural networks (RNNs). RNNs have cycles in their connectivity structure; FNNs do not. In the 1920s, Wilhelm Lenz and Ernst Ising created and analyzed the Ising model,[27] which is essentially a non-learning RNN architecture consisting of neuron-like threshold elements. In 1972, Shun'ichi Amari made this architecture adaptive.[28][29] His learning RNN was popularised by John Hopfield in 1982.
