Neural Style Transfer: Online Image Optimization (Flexible but Slow)

In this article, we demonstrate the power of Deep Learning and Convolutional Neural Networks (CNNs) in creating artistic images via a process called Neural Style Transfer (NST). NST is currently a well-known and trending topic in both the academic literature and industrial applications. Broadly speaking, NST can be divided into two main paradigms:

  1. Online image optimization (discussed in this article)
  2. Offline Network optimization

In this article, we focus on the first paradigm, surveying its main papers.

Online image optimization: Overview

The main idea is to iteratively optimize a random image, not a network, repeatedly changing the image in the direction that minimizes some loss. This iterative optimization is gradient descent performed in the image space.
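To make the idea concrete, here is a minimal, illustrative sketch of gradient descent in image space. This is not any paper's implementation: a fixed random linear map stands in for the frozen CNN feature extractor, and the "image" is a small array optimized so that its features match those of a reference image.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 256))          # stand-in "feature extractor" (frozen)

def features(img):
    return W @ img.ravel()

reference = rng.random((16, 16))            # reference image
target = features(reference)

x = rng.random((16, 16))                    # start from a random image
initial_loss = float(np.sum((features(x) - target) ** 2))

lr = 5e-4
for _ in range(300):
    diff = features(x) - target             # feature mismatch
    grad = 2 * (W.T @ diff).reshape(16, 16) # gradient of the squared loss w.r.t. pixels
    x -= lr * grad                          # update the image, not the network

final_loss = float(np.sum((features(x) - target) ** 2))
```

In a real NST setup, `features` would be activations of a pretrained CNN and the gradient would come from automatic differentiation, but the optimization loop has exactly this shape.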

In the paper "Understanding Deep Image Representations by Inverting Them", the loss is defined as the Euclidean distance between the network's activations for the input image and the corresponding activations for a reference image, plus a regularizer such as Total Variation.
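As an illustration, one common form of the Total Variation regularizer can be sketched as the sum of squared differences between neighboring pixels (variants exist; this is a minimal assumed form, not the paper's exact code):

```python
import numpy as np

def total_variation(img):
    """Sum of squared differences between neighboring pixels."""
    dx = np.diff(img, axis=1)               # horizontal neighbor differences
    dy = np.diff(img, axis=0)               # vertical neighbor differences
    return float(np.sum(dx ** 2) + np.sum(dy ** 2))

flat = np.full((8, 8), 0.5)                 # constant image: zero variation
noisy = np.indices((8, 8)).sum(axis=0) % 2  # checkerboard: high variation
```

Adding this term to the feature-matching loss penalizes high-frequency noise and pushes the reconstruction toward smoother, more natural images.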

The figure above shows five possible reconstructions of the reference image obtained from the 1,000-dimensional code (vector) extracted from a VGG network trained on ImageNet.

All five generated images produce almost the same 1,000-dimensional vector as the original image. In other words, from the model's viewpoint, all these images are almost equivalent.

Example 1: Reconstruction of Images based on Content and Style

In the well-known work "Image Style Transfer Using Convolutional Neural Networks", a new image is constructed through an iterative optimization process in the image space, using a loss that balances two components: one for the content and one for the style.

As discussed here, the content is usually captured by the activations of higher layers, while one way to capture the style is via the correlations of feature maps at different layers. In this setup, the goal is to generate an image that minimizes a weighted sum of the content loss and the style loss.
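The channel correlations are usually summarized in a Gram matrix. The following is a hedged sketch of that representation and of a Gram-based style loss for a single layer (normalization conventions vary between implementations; this one divides by the number of spatial positions):

```python
import numpy as np

def gram_matrix(feats):
    """feats: (C, H, W) feature maps -> (C, C) channel-correlation matrix."""
    c, h, w = feats.shape
    flat = feats.reshape(c, h * w)
    return flat @ flat.T / (h * w)          # normalize by number of positions

def style_loss(gen_feats, style_feats):
    """Squared distance between the two Gram matrices of one layer."""
    d = gram_matrix(gen_feats) - gram_matrix(style_feats)
    return float(np.sum(d ** 2))

rng = np.random.default_rng(0)
layer = rng.standard_normal((4, 8, 8))      # toy feature maps: 4 channels of 8x8
G = gram_matrix(layer)
```

In the full method this loss is summed over several layers and added, with a weighting factor, to the content loss before taking gradients with respect to the image.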

Example 2: Reconstruction of Images using different statistical style representation

Another statistical style representation is proposed in the paper "Demystifying Neural Style Transfer", where it is shown that matching the Gram matrices (used in the previous example) is equivalent to minimizing a specific Maximum Mean Discrepancy (MMD). The style information is thus intrinsically represented by the distributions of activations in a CNN, and style transfer can be achieved by distribution alignment. Moreover, the authors evaluated several other distribution alignment methods and found that they all yield promising transfer results.
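The equivalence can be checked numerically. The sketch below (illustrative, not the paper's code) treats each spatial position's feature vector as one sample and verifies that the unnormalized Gram-matching loss equals the (biased) squared MMD with the quadratic kernel k(a, b) = (aᵀb)²:

```python
import numpy as np

rng = np.random.default_rng(0)
N, C = 20, 5                        # N spatial positions, C channels
X = rng.standard_normal((N, C))     # per-position features of the generated image
Y = rng.standard_normal((N, C))     # per-position features of the style image

Gx, Gy = X.T @ X, Y.T @ Y           # unnormalized Gram matrices
gram_loss = float(np.sum((Gx - Gy) ** 2))

K = lambda A, B: (A @ B.T) ** 2     # quadratic (second-order polynomial) kernel
mmd2 = float(K(X, X).sum() + K(Y, Y).sum() - 2 * K(X, Y).sum())
```

Because the two quantities coincide, swapping in other kernels (linear, Gaussian, etc.) gives the alternative distribution-alignment style losses explored in the paper.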

The figure above shows style reconstructions for different methods at five layers. Each row corresponds to one method, and the reconstructions are obtained using only the style loss. In each column, the style representations are reconstructed using a different subset of layers of the VGG network.

Example 3: Reconstruction of Images while preserving the Coherence

CNN features unavoidably lose some of the low-level information contained in the image, which makes the generated images distorted and irregular-looking. To preserve coherent structures, "Laplacian-Steered Neural Style Transfer" proposes adding constraints on low-level features in pixel space. The Laplacian filter computes the second-order derivatives of the pixels in an image and is widely used for edge detection. In that work, a Laplacian loss is added, defined as the squared Euclidean distance between the Laplacian filter responses of the content image and the stylized result.

As shown in the figure above, minimizing the Laplacian loss drives the stylized image to have detail structures similar to those of the content image while still being rendered in the new style.
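A minimal sketch of this idea, assuming the standard 4-neighbour discrete Laplacian kernel (the paper may additionally pool or smooth; this is only an assumed, simplified form):

```python
import numpy as np

def laplacian(img):
    """4-neighbour discrete Laplacian, evaluated at interior pixels."""
    return (img[:-2, 1:-1] + img[2:, 1:-1]
            + img[1:-1, :-2] + img[1:-1, 2:]
            - 4 * img[1:-1, 1:-1])

def laplacian_loss(content, stylized):
    """Squared Euclidean distance between the two Laplacian responses."""
    d = laplacian(content) - laplacian(stylized)
    return float(np.sum(d ** 2))

rng = np.random.default_rng(0)
content = rng.random((10, 10))
distorted = content + rng.normal(0.0, 0.1, content.shape)  # noisy stylization
```

Because the Laplacian responds to edges and fine detail, this term penalizes exactly the kind of structural distortion the style loss alone tends to introduce.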

Finally, Deep Dreaming can be seen as another online image-optimization method for image generation, driven by the input image and by what the underlying network was trained on.

Final Note:

The online image optimization discussed here is based on an iterative optimization process, gradient descent applied in the image space. Accordingly, the process is time-consuming, especially when the desired reconstructed image is large or when a large number of images must be generated. In the next article, a much faster approach, offline network optimization, is discussed.

Regards
