Artistic Style Transfer

With apps such as Prisma going viral, we have been hearing a lot about artistic style transfer and its offshoots lately. It all started in August 2015, when Gatys et al. published a remarkable paper showing that it was actually possible to transfer the artistic style of one painting onto another picture. We at Dittory were especially eager to explore its uses to enhance product discovery in the e-commerce space.

When we look at a painting, we instinctively understand the commonality in an artist's brush-strokes, colours and style. Who would have thought that a mathematical representation of a concept as abstract as artistic style was possible? Yet, using the features generated by a deep neural network, it was found that a Gram matrix (a matrix of correlations between feature maps) captures style to a great extent.
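To make the idea concrete, here is a minimal NumPy sketch of a Gram matrix computed from a feature map. The shapes and the toy feature map are purely illustrative; in the actual method, the features come from the activations of a pre-trained network's layers.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (channels, height, width) feature map.

    Entry (i, j) is the correlation between channels i and j,
    averaged over all spatial positions - the statistic that
    turns out to capture artistic style.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)      # drop the spatial layout
    return flat @ flat.T / (c * h * w)     # normalised correlations

# Toy feature map standing in for one layer's activations
feats = np.random.rand(64, 8, 8)
g = gram_matrix(feats)
print(g.shape)  # (64, 64) - the style representation for this layer
```

Because the spatial dimensions are flattened away before the correlation is taken, the Gram matrix records *which features co-occur*, not *where* they occur, which is exactly why it describes style rather than content.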

Using a pre-trained neural network such as VGG-19, an input image (i.e. the image which provides the content), a style image (a painting with strong style elements) and a random image (the output image), one can minimise three losses in the network: the style loss (between the style of the output image and that of the style image), the content loss (between the content image and the output image) and the total variation loss (which ensures pixel-wise smoothness). The output image generated from such a network resembles the input image while carrying the stylistic attributes of the style image.
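A hedged sketch of the three losses, with NumPy standing in for the deep-learning framework. The feature shapes, the loss weights and the helper names are our own choices for illustration; in practice the features come from several VGG-19 layers and the weights are tuned per image.

```python
import numpy as np

def gram(f):
    c, h, w = f.shape
    flat = f.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)

def content_loss(content_feats, output_feats):
    # How far the output drifts from the content image's features
    return np.mean((content_feats - output_feats) ** 2)

def style_loss(style_feats, output_feats):
    # Compare Gram matrices, not raw features: style, not layout
    return np.mean((gram(style_feats) - gram(output_feats)) ** 2)

def total_variation_loss(img):
    # Penalise differences between neighbouring pixels (smoothness)
    dh = np.sum((img[:, 1:, :] - img[:, :-1, :]) ** 2)
    dw = np.sum((img[:, :, 1:] - img[:, :, :-1]) ** 2)
    return dh + dw

# The optimisation minimises a weighted sum of all three
alpha, beta, gamma = 1.0, 1e3, 1e-2   # illustrative weights
c_f = np.random.rand(16, 8, 8)        # content-image features
s_f = np.random.rand(16, 8, 8)        # style-image features
out = np.random.rand(16, 8, 8)        # output image / its features
loss = (alpha * content_loss(c_f, out)
        + beta * style_loss(s_f, out)
        + gamma * total_variation_loss(out))
```

In the real method the optimiser updates the pixels of the output image (not the network weights) to drive this combined loss down.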

Output image (Taj Mahal with the style of Van Gogh's The Starry Night)

Shahrukh Khan (with the style of Lundstrøm)

Tweaking the weights of these losses and introducing slight changes in the cost function leads to interesting variations in the output image. Masking a part of the image and transferring style only to specific components is also a very useful technique. Using a simple content mask, it is possible to minimise the style loss within the masked portion alone by nullifying the network gradients outside the mask. Another method is to simply superimpose the non-stylised portions of the original image back onto the output image.
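Both masking tricks mentioned above can be sketched in a few lines of NumPy. The array shapes and function names are our own, purely for illustration of the idea.

```python
import numpy as np

def mask_style_gradient(grad, mask):
    """Zero the style-loss gradient outside the mask, so the
    optimiser only restyles the masked region.

    grad: (H, W, 3) gradient w.r.t. the output image
    mask: (H, W) binary mask, 1 inside the region to stylise
    """
    return grad * mask[:, :, None]

def composite(output_img, content_img, mask):
    """The simpler alternative: paste the non-stylised pixels of
    the original content image back over the stylised output."""
    m = mask[:, :, None]
    return output_img * m + content_img * (1 - m)

# Toy example: stylise only the left half of a 4x4 image
mask = np.zeros((4, 4))
mask[:, :2] = 1
out = np.ones((4, 4, 3))       # pretend stylised output
content = np.zeros((4, 4, 3))  # pretend original content
blended = composite(out, content, mask)
```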

Shahrukh Khan with style of Ben Giles (Masked transfer)

Generating images in which only the style is transferred while the original colour is retained is one more variation. Another interesting experiment was to use style blends, where styles from different artists are blended to create a completely new style.
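One simple way to realise such a blend is to take a weighted average of the Gram-matrix targets of the individual styles. A minimal sketch, where the helper name and blend weights are illustrative assumptions:

```python
import numpy as np

def blended_style_target(grams, weights):
    """Weighted average of per-style Gram matrices (all computed
    at the same network layer), forming a single blended target
    for the style loss."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                       # normalise blend weights
    return sum(wi * g for wi, g in zip(w, grams))

# Blend two styles 70/30 - say, watercolours and a Picasso
g1 = np.eye(3)                            # toy Gram matrix, style 1
g2 = np.ones((3, 3))                      # toy Gram matrix, style 2
target = blended_style_target([g1, g2], [0.7, 0.3])
```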

Taj Mahal Styled with a blend of Milind Mullick’s watercolors and Picasso’s La Muse

However, since these optimisation methods always need both the content and style images as inputs and take some time to process, they came to be called the “slow” approach. Many of the apps available now, however, generate stylised images in real time. This was made possible by the fast neural-style approach of Johnson et al., where a separate feed-forward network is trained to minimise the loss for one style image over a large batch of content images. This also means that one network is needed per style. While experimenting with the TensorFlow implementations of this approach, we found that it does give appreciable results, but still requires a lot of fine-tuning to produce an image close to what the optimisation methods give. Moreover, one realises that a lot of artistic sensibility is required even to figure out what makes a good style image.

Deep photo-realistic style transfer is another very promising offshoot, in which the feel of one photo can be transferred to another. The overall concept is nearly the same as Gatys' optimisation approach, the main differences being the ability to selectively transfer styles between image segments based on broad labels, and an optimisation towards more realistic outlines and shapes in the output.

Courtesy: https://github.com/luanfujun/deep-photo-styletransfer

There has been increased interest in using other architectures such as Generative Adversarial Networks (GANs), though the results currently don't look as appealing as those generated by the optimisation approach. Advancements here are still happening, however: the more recent DiscoGAN (Discovery GAN) looks much more exciting and leads to interesting cross-domain style transfers.

Audio and music style transfer has already made some progress, and more use cases pertaining to uniquely human tasks, such as the style of playing chess, are also being explored using more generalised frameworks of style transfer.

The debate over whether machines can truly generate unique styles, and whether those styles deserve the same place as human-generated art, continues. But if we leave that debate aside, finding maths hidden in both the most mundane and the most creative things we do is truly marvellous.

