Day 2/60 Reviewing AI & Machine Learning: Generative Deep Learning

Deep Learning

A branch of machine learning loosely inspired by the way the human brain learns. Most deep learning systems are neural networks, which contain multiple stacked hidden layers. More broadly, any machine learning or AI system that employs many layers to learn a high-level representation of input data is also considered deep learning (e.g. deep Boltzmann machines).

A neural network consists of an input layer, where the input data is passed in, and an output layer, where the output is returned. Layers between the input and output layers are referred to as “hidden” layers. Successive hidden layers can capture increasingly advanced and sophisticated aspects of the original input.

The goal of training a neural network is to find the set of weights for each layer that makes accurate predictions, which is achieved by training on a suitable dataset. When we initialize a neural network, the weights and biases are random; the more we train, the more they are adjusted toward values that produce accurate predictions.

Here is a list of projects you can try:

  • A neural network that can recognize handwritten digits (use the MNIST dataset for training and testing; a minimal sketch follows this list)
  • A neural network that can classify 10 common objects we see in our everyday lives (use the CIFAR-10 dataset for training and testing)
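
Here is a minimal sketch of the first project above, assuming TensorFlow/Keras is installed; the layer sizes and number of epochs are illustrative choices, not tuned values.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Load MNIST: 60,000 training and 10,000 test images of handwritten digits (28x28 pixels)
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixel values to [0, 1]

# Input layer -> one hidden layer -> output layer
model = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),    # input layer: 28x28 pixels flattened to 784 values
    layers.Dense(128, activation="relu"),    # hidden layer
    layers.Dense(10, activation="softmax"),  # output layer: one probability per digit 0-9
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

model.fit(x_train, y_train, epochs=5)  # training adjusts the randomly initialized weights/biases
model.evaluate(x_test, y_test)         # check how well the model generalizes to unseen data
```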


The core principle of machine learning is to ensure the model generalizes to unseen data rather than simply memorizing the training dataset. It is like a student who memorizes physics formulas instead of truly understanding them.

Dropout layers have been widely used to help neural networks achieve this. The idea is very simple: during training, each dropout layer chooses a random set of inputs from the preceding layer and sets them to zero.

Adding dropout layers drastically reduces overfitting by ensuring that the model does not become over-reliant on its training set.

Dropout layers are most commonly used after fully connected layers, which are the most prone to overfitting due to their large number of weights.
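
As a rough sketch of how this looks in practice (assuming Keras), a dropout layer is typically placed right after a fully connected layer; the 0.5 rate here is just an illustrative choice.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(256, activation="relu"),   # fully connected layer: many weights, prone to overfitting
    layers.Dropout(0.5),                    # randomly zeroes 50% of the preceding layer's outputs
    layers.Dense(10, activation="softmax"),
])
```

Note that Keras only applies dropout during training; at inference time the layer passes its inputs through unchanged.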

Batch normalization also has a regularizing effect, and some modern architectures rely on it rather than dropout for regularization. However, there is no set rule: different regularization techniques suit different situations.


Autoencoders in Generative AI:

Autoencoder: a neural network made up of two parts, an encoder and a decoder

Encoder - compresses input data into a compact representation vector

Decoder - decompresses a given representation vector back into the form of the original input

Therefore, an autoencoder aims to minimize the loss between the original input and the reconstruction of the input after its trip through the neural network.

Let’s say you have a picture of a flower.

  1. Picture → input
  2. Encoder processes this picture and reduces it to a simpler form, capturing the most important features (like edges, shapes, etc.). This simpler form is called the latent space or bottleneck.
  3. Latent Space: This is a compact version of the original picture that still contains the essential information.
  4. Decoder: The decoder takes this latent space representation and tries to reconstruct the original picture of the flower.

By minimizing the difference between the original input and the reconstructed output, we can fine-tune our model. This allows us to generate complex images from the latent space, which has been optimized based on our training dataset or input data.
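
A bare-bones autoencoder sketch along these lines, assuming Keras and images flattened to 784-dimensional vectors scaled to [0, 1]; the 32-dimensional latent space is an arbitrary illustrative choice.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

latent_dim = 32  # size of the latent space / bottleneck

# Encoder: compresses a flattened image into a 32-dimensional representation vector
encoder_input = layers.Input(shape=(784,))
latent = layers.Dense(latent_dim, activation="relu")(encoder_input)
encoder = models.Model(encoder_input, latent, name="encoder")

# Decoder: reconstructs the image from the representation vector
decoder_input = layers.Input(shape=(latent_dim,))
reconstruction = layers.Dense(784, activation="sigmoid")(decoder_input)
decoder = models.Model(decoder_input, reconstruction, name="decoder")

# Autoencoder: encoder followed by decoder, trained to reproduce its own input
autoencoder = models.Model(encoder_input, decoder(encoder(encoder_input)))
autoencoder.compile(optimizer="adam", loss="mse")  # minimize the reconstruction loss

# Training pairs each image with itself, i.e. input == target:
# autoencoder.fit(x_train, x_train, epochs=10, batch_size=256)
```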



Generative Adversarial Networks (GANs)

A Generative Adversarial Network (GAN) is a type of neural network architecture used in machine learning, specifically in the field of generative modeling. It consists of two main parts:

  • Generator
  • Discriminator

Generator: creates new data that looks similar to real data. For example, if we are working with images, the generator will try to create new images that look like the real ones it has seen during training.

Discriminator: looks at data and determines whether it is real (from the training dataset) or fake (created by the generator). It acts like a detective trying to catch counterfeit currency.

Training Process

  • The generator creates fake data and sends it to the discriminator.
  • The discriminator evaluates the data and decides if it is real or fake.

Based on the discriminator's feedback, the generator tries to improve its fake data to fool the discriminator next time.

The discriminator also keeps improving to better distinguish between real and fake data.

This process is repeated, improving the generated data (e.g. images, text, etc.) until the discriminator can no longer accurately distinguish fake data from real data.
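
Here is a rough sketch of a single training step along these lines, assuming Keras models named `generator` and `discriminator` where the discriminator outputs raw logits; it is an illustrative loop, not a tuned recipe.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
g_opt = tf.keras.optimizers.Adam(1e-4)
d_opt = tf.keras.optimizers.Adam(1e-4)

def train_step(real_images, generator, discriminator, latent_dim=100):
    noise = tf.random.normal([tf.shape(real_images)[0], latent_dim])

    with tf.GradientTape() as g_tape, tf.GradientTape() as d_tape:
        fake_images = generator(noise, training=True)             # generator creates fake data
        real_logits = discriminator(real_images, training=True)   # discriminator judges real data
        fake_logits = discriminator(fake_images, training=True)   # ...and the generator's fakes

        # Discriminator: label real images 1 and fake images 0
        d_loss = bce(tf.ones_like(real_logits), real_logits) + \
                 bce(tf.zeros_like(fake_logits), fake_logits)
        # Generator: wants the discriminator to call its fakes real (label 1)
        g_loss = bce(tf.ones_like(fake_logits), fake_logits)

    d_opt.apply_gradients(zip(d_tape.gradient(d_loss, discriminator.trainable_variables),
                              discriminator.trainable_variables))
    g_opt.apply_gradients(zip(g_tape.gradient(g_loss, generator.trainable_variables),
                              generator.trainable_variables))
    return d_loss, g_loss
```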

GAN challenges:

  • Oscillating loss: occurs when the generator and discriminator keep improving, but they don't stabilize, making training very unpredictable.
  • Mode collapse: occurs when the generator keeps creating the same kind of fake data over and over again, ignoring the variety in real data.
  • Uninformative loss: occurs when the feedback from the discriminator to the generator is not helpful, making it hard for the generator to improve.
  • Hyperparameters: the settings you choose before starting the training process. They affect how the GAN learns and performs, and poorly chosen hyperparameters can hurt your GAN's performance.

Tackling the GAN challenges:

  • Wasserstein loss: a meaningful loss metric that correlates with the generator’s convergence and sample quality. In other words, it aims to improve the stability of the training process and reduce the oscillating loss.
  • Lipschitz constraint: a constraint put into place that keeps the discriminator’s output from changing too rapidly or wildly. It ensures that the changes are smooth and gradual.
  • Weight clipping: limits the values of the discriminator's weights to a small fixed range, a simple (if crude) way of enforcing the Lipschitz constraint and keeping the model's behavior under control
  • Gradient penalty loss: encourages smooth and consistent changes in the discriminator by penalizing it when its gradients become too extreme (see the sketch after this list)
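
A sketch of the gradient penalty term, assuming image tensors of shape (batch, height, width, channels) and a Keras-style discriminator; the interpolation-and-norm approach follows the common WGAN-GP formulation.

```python
import tensorflow as tf

def gradient_penalty(discriminator, real_images, fake_images):
    """Penalize the discriminator when its gradient norm drifts away from 1."""
    batch_size = tf.shape(real_images)[0]
    # Random points on the lines between real and fake samples
    alpha = tf.random.uniform([batch_size, 1, 1, 1], 0.0, 1.0)
    interpolated = alpha * real_images + (1.0 - alpha) * fake_images

    with tf.GradientTape() as tape:
        tape.watch(interpolated)
        scores = discriminator(interpolated, training=True)

    grads = tape.gradient(scores, interpolated)
    grad_norm = tf.sqrt(tf.reduce_sum(tf.square(grads), axis=[1, 2, 3]) + 1e-12)
    return tf.reduce_mean((grad_norm - 1.0) ** 2)
```

In practice this penalty is added to the discriminator's Wasserstein loss with a weight (10 is the value used in the original WGAN-GP paper).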


How do machines draw, compose, and write?

Draw:

CycleGAN: A type of GAN that can transform images from one domain to another

It uses two generators and two discriminators. One generator learns to transform images from Domain A to Domain B, and another generator learns to do the reverse. Cycle consistency loss ensures that if you convert an image to another domain and back, you get the original image.
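
A small sketch of the cycle consistency term, assuming two Keras-style generators `g_ab` (Domain A to B) and `g_ba` (B to A); the L1 distance and the weight of 10 are common but illustrative choices.

```python
import tensorflow as tf

def cycle_consistency_loss(g_ab, g_ba, real_a, real_b, weight=10.0):
    # A -> B -> back to A should recover the original Domain A image
    reconstructed_a = g_ba(g_ab(real_a, training=True), training=True)
    # B -> A -> back to B should recover the original Domain B image
    reconstructed_b = g_ab(g_ba(real_b, training=True), training=True)
    loss = tf.reduce_mean(tf.abs(real_a - reconstructed_a)) + \
           tf.reduce_mean(tf.abs(real_b - reconstructed_b))
    return weight * loss  # the weight balances this term against the adversarial losses
```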

Neural Style Transfer: takes two images—a content image and a style image—and blends them so that the output image looks like the content image but in the style of the style image.

It uses a convolutional neural network (CNN) to extract features from both images, then minimizes a loss function that combines the content loss (difference in content features) and the style loss (difference in style features) to generate the final image.
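
A sketch of that combined objective, assuming the content features and the style features (usually represented as Gram matrices of CNN activations, e.g. from VGG19) have already been extracted elsewhere; `alpha` and `beta` are illustrative weights.

```python
import tensorflow as tf

def style_transfer_loss(content_features, generated_content,
                        style_grams, generated_grams,
                        alpha=1.0, beta=1e4):
    # Content loss: difference between the content image's features and the output's features
    content_loss = tf.reduce_mean(tf.square(generated_content - content_features))
    # Style loss: difference between the Gram matrices (feature correlations) of style and output
    style_loss = tf.add_n([tf.reduce_mean(tf.square(g - s))
                           for g, s in zip(generated_grams, style_grams)])
    return alpha * content_loss + beta * style_loss
```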

Modern-day models such as Stable Diffusion and DALL-E have indeed gained significant attention and popularity, often eclipsing earlier techniques like CycleGAN and Neural Style Transfer.

Compose & Write:

Recurrent Neural Networks (RNNs): great for sequential data and can remember previous inputs in the sequence. They are useful for tasks like composing music or writing text. This type of model is more suitable for generating monophonic music, with a single melody line (e.g. a solo singer humming without any additional harmony or other notes played together). Nowadays, other models are better suited to generating polyphonic music, which I do not have much knowledge about yet.

Long Short-Term Memory Networks (LSTMs): A type of RNN designed to remember information for long periods, making them better at handling long sequences of data.

Transformers: handle sequential data without the limitations of RNNs. They use a mechanism called self-attention to weigh the importance of each word in a sequence relative to the others.
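
A bare-bones NumPy sketch of scaled dot-product self-attention; real transformers learn separate query/key/value projections and use multiple attention heads, so this only shows the core computation.

```python
import numpy as np

def self_attention(x):
    """Scaled dot-product self-attention over a sequence x of shape (seq_len, d_model)."""
    d_model = x.shape[-1]
    q, k, v = x, x, x                    # queries, keys and values (no learned projections here)
    scores = q @ k.T / np.sqrt(d_model)  # how much each word should attend to every other word
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ v                   # each output is a weighted sum of the values

# Example: a "sentence" of 4 tokens, each represented by an 8-dimensional vector
tokens = np.random.randn(4, 8)
print(self_attention(tokens).shape)  # (4, 8)
```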

Generative Pre-trained Transformers (GPTs): A specific type of transformer model designed for generating human-like text. It’s trained on large datasets of text and can produce coherent and contextually relevant sentences.


Let me know your thoughts in the comments below. See you soon in day 3!
