How to use a trained neural network to perform style transfer on a photo.
When people talk about 'AI' they're often talking about 'automation' or, more specifically, automation enabled through deep learning applications.
One very interesting, and fun, application of DL is to perform a style transfer from one visual artifact to another. In this example, I use Python and a variety of models, trained on the techniques of various paintings, to apply the specific techniques of a particular painting to a unique photograph that I provided.
The first painting I used was Rain Princess by Leonid Afremov, pictured in the masthead of this article. I applied its style to a photograph of my daughter Zoe. You can see the original photo, and the post-transfer photo, below.
Original photo:
Post-transfer version:
Not bad, if a bit uncanny. Wanting to understand whether the style transfer would be consistent across different types of photos, I ran the script against two additional photos I had taken: a macro (close-up) shot of a cactus, and a telephoto capture of an American flag. Output shown below.
Image of cactus*, as originally shot:
*I'm not actually sure this is a cactus; it's some variety of succulent planted next to other plants that are clearly cacti, so I'm calling it a cactus.
Same photo, post Rain Princess:
Image of flag, as originally shot:
Same flag, post-transfer:
Consistency of the model
The application of the style seems consistent in all three images, with the allowance that the cactus image had a tiny depth of field and thus full background blur (bokeh) going on, whereas the flag and portrait are sharp all the way through. This consistency is a good sign: it suggests the model generalizes well across different kinds of photos, although three images are a small sample.
Revisiting the cactus
Because the image of the cactus has the oddest photographic characteristics of the three -- it was taken with a 200mm macro lens, extremely close up, at an aperture of f/4.0 with serious background blur and almost nothing in focus -- I wanted to see what would happen if I applied three other painting styles to it, specifically:
Picasso's La Muse:
Picabia's Udnie:
Hokusai's The Great Wave off Kanagawa:
The results of the cactus experiment...
...were interesting. Of the three, the Picasso transfer looks the most 'painterly' and the Hokusai the most photo-realistic (to me). The Picasso model filled in the blur with tiny polygons, almost like stained glass, which gives the whole image a slightly avant-garde neo-Gothic cathedral vibe.
The Hokusai version looks almost like a pressed-glass microscope photo you might find in a biochemistry textbook, illustrating some organic process, like photosynthesis, say, or cell death. The Picabia version freaked me out a bit, like some hairy alien vegetable with bad intentions (we all know that type).
Below are the three outputs, in the same order (Picasso -- Picabia -- Hokusai). Shown in high-res for detail.
Picasso:
Picabia:
Hokusai:
Takeaways (in brief):
- Style transfer can be thought of as a very advanced version of an Instagram filter. Whereas a particular filter applies the same rules to every image it encounters (Juno = 'increase reds, decrease exposure, darken edges'), a style transfer applies thousands of slightly variable modifications, unique to each image. To give a sense of the difference, measured in processing time: on a 2018 MacBook Pro with a 2.6 GHz Intel Core i7 processor, applying an Instagram filter to a 1 MB photo takes about 100 milliseconds from start to finish (one-tenth of a second). Operating on the same 1 MB photo file, applying the Rain Princess style takes about 5 minutes, start to finish, or roughly 3,000 times as long (during which time your laptop's CPU will heat up, your fan will go into overdrive, and you'll wonder if you should've sprung for the 16-core iMac Pro).
- It is important to test the model (saved as a 'checkpoint' file) you want to use for a particular transfer before committing to it, particularly if you are using it in a commercial setting, such as at an advertising or design agency. The most reliable way to do this is to run it against a highly variable sample of images, with different subject, color, lighting, and focal attributes.
- To increase consistency of output, transfer from like-to-like. In other words, if you're working with portrait photographs, use paintings that are also portraits for your style transfer. Same with landscapes and physical objects. Otherwise, you may end up with some strange (and occasionally hilarious) final images.
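To make the filter comparison in the first takeaway concrete, here is a minimal sketch of what "the same rules for every image" looks like in code. It follows the Juno description above (increase reds, decrease exposure, darken edges), but the function name and every constant in it are my own illustrative guesses, not Instagram's actual parameters. It uses plain Python lists of pixels rather than Pillow so that it runs with no installation.

```python
import math


def juno_like_filter(pixels, width, height):
    """Apply one fixed recipe to every pixel: boost red, lower exposure, darken edges.

    `pixels` is a row-major list of (r, g, b) tuples with values 0-255.
    All constants are illustrative assumptions, not Instagram's real Juno values.
    """
    cx, cy = (width - 1) / 2, (height - 1) / 2
    max_dist = math.hypot(cx, cy) or 1.0  # avoid dividing by zero on a 1x1 image
    exposure = 0.9  # global darkening, identical for every image
    out = []
    for i, (r, g, b) in enumerate(pixels):
        x, y = i % width, i // width
        # Vignette: 1.0 at the center, falling linearly to 0.6 at the corners.
        vignette = 1.0 - 0.4 * (math.hypot(x - cx, y - cy) / max_dist)
        scale = exposure * vignette
        out.append((
            min(255, round(r * 1.15 * scale)),  # red gets an extra boost
            min(255, round(g * scale)),
            min(255, round(b * scale)),
        ))
    return out


# A tiny 3x3 mid-gray test image.
gray = [(100, 100, 100)] * 9
filtered = juno_like_filter(gray, 3, 3)
```

The recipe depends only on a pixel's position and color, never on what the photo depicts, which is why it is so cheap to compute. A style-transfer network instead pushes the whole image through many convolutional layers, so its output at each pixel depends on the surrounding content.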
Technical notes
I used the following software packages and frameworks to create the images in this article, all of which I installed via command line on the aforementioned MacBook Pro.
- Python 3
- TensorFlow (Google's open-source ML platform)
- Pillow (Python imaging library)
- SciPy (science and mathematics library for Python)
- Anaconda (popular data science distribution for Python)
- The convolutional neural network by lengstrom available in this GitHub repo
- Pre-created checkpoint files
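For anyone who wants to try several checkpoints against the same folder of test photos, the sketch below builds one command per checkpoint. It assumes the repo provides an evaluate.py script that accepts --checkpoint, --in-path and --out-path options, which is how I recall its README describing it; please verify against the repo before relying on it. The helper name, file names, and folder names are hypothetical.

```python
from pathlib import Path


def build_eval_commands(checkpoints, in_dir, out_root, script="evaluate.py"):
    """Return one command (as an argument list) per checkpoint file.

    Assumes the script accepts --checkpoint, --in-path and --out-path;
    check the repo's README, since this is not verified here.
    """
    commands = []
    for ckpt in checkpoints:
        ckpt = Path(ckpt)
        # Keep each style's results in its own folder, e.g. results/rain_princess.
        out_dir = Path(out_root) / ckpt.stem
        commands.append([
            "python", script,
            "--checkpoint", str(ckpt),
            "--in-path", str(in_dir),
            "--out-path", str(out_dir),
        ])
    return commands


# Hypothetical file and folder names, for illustration only.
commands = build_eval_commands(
    ["checkpoints/rain_princess.ckpt", "checkpoints/la_muse.ckpt"],
    in_dir="test_photos",
    out_root="results",
)
for cmd in commands:
    print(" ".join(cmd))
    # To actually run it from inside the cloned repo:
    #   import subprocess; subprocess.run(cmd, check=True)
```

The commands are only printed, not executed, so the sketch is safe to run anywhere. Writing each style to its own output folder makes it easy to compare results side by side; those folders may need to exist before the script is run.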
If you'd like installation instructions or setup details for any of the above, ping me.