Variational autoencoders (VAEs)
Introduction:
An autoencoder is a type of neural network designed to learn efficient representations of input data. It consists of an encoder that maps the input to a latent space and a decoder that reverses the mapping. The encoder and decoder are typically trained together to minimize the discrepancy between the input data and its reconstruction.
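As a concrete illustration, here is a minimal sketch of a plain autoencoder in PyTorch. The layer sizes and the mean-squared-error reconstruction loss are illustrative assumptions, not details from the article:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        # Encoder: maps the input to a lower-dimensional latent code
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim),
        )
        # Decoder: maps the latent code back to the input space
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(),
            nn.Linear(128, input_dim),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = Autoencoder()
x = torch.randn(16, 784)                    # a dummy batch of flattened inputs
loss = nn.functional.mse_loss(model(x), x)  # reconstruction loss: input vs. output
```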
Autoencoders were developed to address the problem of feature engineering: the manual process of selecting and transforming input features to improve the performance of a machine-learning model. Feature engineering can be time-consuming and error-prone, since it requires domain expertise and is subject to the engineer's biases. Autoencoders automate this process by learning effective representations of the input data directly from the data itself.
Autoencoders vs variational autoencoders:
What distinguishes variational autoencoders (VAEs) from regular autoencoders is their use of probabilistic modeling. Conventional autoencoders are purely deterministic and aim to reconstruct the input data as precisely as possible. VAEs instead learn a probability distribution over the latent variables that best explains the input data, and new data can then be sampled from that distribution.
Specifically, VAEs model the latent variables as being drawn from a prior distribution, usually a standard Gaussian. The encoder maps the input data to a distribution in the latent space, and the decoder maps from the latent space back to the data space. During training, VAEs minimize a loss function that combines a reconstruction term (measuring how precisely the input data is reconstructed) with a regularization term (measuring the difference between the learned latent variable distribution and the prior).
The probabilistic modeling in VAEs enables the generation of new data by sampling from the learned latent variable distribution, which is useful for tasks like data synthesis and data augmentation. VAEs have also been shown to work well for learning disentangled representations of the input data, in which each dimension of the latent space corresponds to a distinct source of variation in the input.
In the context of machine learning, probabilistic modeling is used to build models that can handle uncertain or incomplete data. For example, in image recognition, it is common for an image to be occluded or partially obscured, and probabilistic models can be used to account for this variability and still recognize the image accurately.
VAE architecture:
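To make the architecture concrete, here is a hedged sketch of a typical VAE in PyTorch: the encoder outputs the mean and log-variance of a Gaussian over the latent variables, a latent vector is sampled from that distribution (via the reparameterization trick discussed later), and the decoder maps it back to the data space. The layer sizes are illustrative assumptions:

```python
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, input_dim=784, hidden_dim=256, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        # The encoder outputs the parameters of a Gaussian over the latent variables
        self.fc_mu = nn.Linear(hidden_dim, latent_dim)
        self.fc_logvar = nn.Linear(hidden_dim, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, input_dim), nn.Sigmoid(),
        )

    def encode(self, x):
        h = self.encoder(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        # Sample z = mu + sigma * eps, with eps drawn from a standard normal
        std = torch.exp(0.5 * logvar)
        eps = torch.randn_like(std)
        return mu + std * eps

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decoder(z), mu, logvar
```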
VAE applications:
Typical applications include generating new images, compressing data, and denoising data. In each of these applications, the VAE is first trained on a dataset of input data: it learns to encode the input into a lower-dimensional latent space and to decode that latent representation back into a reconstruction of the original data. Once the VAE has been trained, it can be used to perform the desired task.
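For image generation, for example, new samples can be produced by drawing latent vectors from the prior and passing them through the trained decoder. A minimal sketch, assuming the VAE class from the architecture sketch above and its 32-dimensional latent space:

```python
import torch

model.eval()                      # `model` is a trained instance of the VAE sketched above
with torch.no_grad():
    z = torch.randn(64, 32)       # 64 latent vectors drawn from the standard normal prior
    samples = model.decoder(z)    # decoded into 64 new data points
```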
VAE loss function:
The loss function used in training a VAE is composed of two parts: the reconstruction loss and the regularization loss.
The reconstruction loss measures the difference between the input and the output of the VAE, while the regularization loss encourages the distribution produced by the encoder to stay close to a unit Gaussian. The regularization loss is typically computed as the Kullback-Leibler (KL) divergence between the encoder's output distribution and the target distribution.
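When the encoder outputs a diagonal Gaussian and the target is the unit Gaussian, the KL term has a simple closed form, so the full loss fits in a few lines. A minimal sketch, using a mean-squared-error reconstruction term (one common choice; binary cross-entropy is another):

```python
import torch
import torch.nn.functional as F

def vae_loss(x, x_recon, mu, logvar):
    # Reconstruction loss: how far the reconstruction is from the input
    recon_loss = F.mse_loss(x_recon, x, reduction="sum")
    # KL divergence between N(mu, sigma^2) and the unit Gaussian N(0, I), in closed form
    kl_loss = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon_loss + kl_loss
```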
The optimization techniques used in VAE training are similar to those used in other neural network architectures, such as stochastic gradient descent (SGD) and its variants. However, due to the probabilistic nature of VAEs, additional techniques such as the reparameterization trick and importance sampling are used to improve training stability and convergence. Techniques such as early stopping and learning rate schedules can also help optimize VAE training.
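A minimal training-loop sketch, assuming the VAE model and the vae_loss function from the sketches above, plus a hypothetical train_loader that yields batches of flattened inputs:

```python
import torch

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)

for epoch in range(50):
    for x in train_loader:                       # hypothetical loader of input batches
        x_recon, mu, logvar = model(x)
        loss = vae_loss(x, x_recon, mu, logvar)  # reconstruction + KL, as defined above
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    scheduler.step()                             # learning-rate schedule: halve the LR every 10 epochs
```

Early stopping would simply monitor the loss on held-out data and stop training once it stops improving.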
The reparameterization trick:
The reparameterization trick is a method employed by variational autoencoders (VAEs) to enable training of the model with backpropagation. In a VAE, the encoder produces a distribution over the latent variables, a latent vector is sampled from that distribution, and the decoder uses the sample to reconstruct the input. The trick is to isolate the randomness from the model parameters by reformulating the stochastic process that generates the latent variables. As a result, gradients can be propagated through the model and stochastic gradient descent can be used to optimize it. Specifically, instead of sampling directly from the distribution output by the encoder, the reparameterization trick draws a sample from a fixed distribution (e.g. a standard normal) and computes the latent vector as a deterministic function of the encoder's output and that sample. The reparameterization trick is key to the successful training of VAEs and has become a standard technique in deep generative models.
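In code, the trick replaces a direct draw from N(mu, sigma^2) with a deterministic function of the mean, the log-variance, and an auxiliary noise sample, so gradients can flow back into the encoder's outputs. A minimal sketch with illustrative placeholder values for mu and logvar:

```python
import torch

mu = torch.zeros(32, requires_grad=True)      # encoder output: mean (placeholder values)
logvar = torch.zeros(32, requires_grad=True)  # encoder output: log-variance (placeholder values)

# Sampling directly from N(mu, sigma^2) would not let gradients reach mu and logvar.
# Reparameterized sampling: the randomness lives in eps, and z is a deterministic
# function of mu and logvar, so it stays differentiable with respect to both.
eps = torch.randn_like(mu)                    # eps ~ N(0, I), a fixed distribution
z = mu + torch.exp(0.5 * logvar) * eps        # z ~ N(mu, sigma^2)
z.sum().backward()                            # gradients now flow back to mu and logvar
```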
VAEs and other generative models:
Variational autoencoders (VAEs) are one type of generative model; generative models learn the underlying structure of a dataset and generate new samples that resemble the original data. While VAEs and generative adversarial networks (GANs) are both generative models, they differ in their approach to learning the latent space. GANs use a two-part network consisting of a generator and a discriminator trained against each other, while VAEs use an encoder and a decoder trained jointly to encode and decode the data.
Another type of generative model is the autoregressive model, which is based on predicting the probability of each individual element in the data based on the previous elements. Unlike VAEs, which learn a compressed representation of the data, autoregressive models preserve the structure of the data and can be used for tasks such as text generation.
In summary, while VAEs, GANs, and autoregressive models are all generative models, they differ in their approach to learning the underlying structure of the data and generating new samples. The choice of which model to use depends on the specific task and the characteristics of the data being modeled.