Generating a Custom Headshot with 6 AI Tools
I'm sure most of us can relate: you've had some headshots taken at work, and while they're great, you're sure they could be more interesting. Maybe even more 'you'? After my latest spin in front of the photographer, I was left wondering what I might end up with if I could really indulge in some tinkering. With all the buzz around generative AI at the moment, I decided to dive on in!
There are apps of course - like Lensa's Magic Avatars that blew up not so long ago - though these are paid, you have to send your photos offsite, and you don't get granular control.
I'd heard from a colleague, Davide La Sala, that he'd been playing with DreamBooth and Stable Diffusion for a little more direction and control, and figured this was a good lead.
This set me off on a journey across 6 separate AI tools to generate a completely synthetic new headshot, turbocharging my productivity:
First Steps: Fact Finding
Where to begin? I found myself immediately dreading the task of having to wade through endless YouTube videos and articles trying to find the right starting point. And then of course I remembered: Bing Chat. I could just ask it all of my dumb questions.
After a quick little chat with Bing, I landed on a fantastic YouTube tutorial by Dr. Furkan Gözükara on his YouTube channel SETutorials.
Kicking off the tutorial
Dr. Furkan walks us through getting set up on Google Colab, prepping your images, duplicating his notebook, spinning up a virtual machine and beginning to execute some code.
This method enables you to train a custom model on your own photos using DreamBooth, and to then feed that model into Stable Diffusion to generate your images.
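For a sense of the shape of this workflow, here's a minimal sketch of the sampling side using Hugging Face's diffusers library. It's not the notebook's exact code, and the checkpoint path and the 'danmoller' token are stand-ins for whatever your own training run produces:

# Minimal sketch of the sampling step (not the notebook's exact code).
# Assumes DreamBooth training has already saved a fine-tuned checkpoint.
import torch
from diffusers import StableDiffusionPipeline

# Hypothetical output directory written by the DreamBooth training step
pipe = StableDiffusionPipeline.from_pretrained(
    "/content/dreambooth-output",
    torch_dtype=torch.float16,
).to("cuda")

# The rare token used during training ("danmoller") now maps to my face
image = pipe("portrait photograph of danmoller, studio lighting").images[0]
image.save("headshot_test.png")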
However, it wasn't long before the scripts in the notebook were throwing errors. This method relies on packages like torchvision, which are updated frequently; the tutorial and notebook were five months old by the time I picked them up, and there were some version compatibility issues. I'm not a hardcore coder, and it would typically take me a long time to unpick error messages like these!
What to do? I remembered reading that Google Bard can be a great debugging assistant. When I copied the error messages into Bard, it explained what the problem was and wrote code for me to execute back in Colab to fix it. Amazing!
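For anyone hitting the same wall, the fix boiled down to pinning compatible package versions in the first Colab cell before running the rest of the notebook. Something along these lines, though the exact versions here are illustrative rather than the specific pins I used:

# Pin torch and torchvision to a known-compatible pair before running the
# notebook's cells. Versions below are illustrative; check the notebook's
# requirements for the pair it was written against.
!pip install -q "torch==2.0.1" "torchvision==0.15.2"

import torch, torchvision
print(torch.__version__, torchvision.__version__)  # confirm the pins took effect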
(Fun note: it seems since I ran into this, the dependencies have been updated!)
Testing out the dataset
I assembled a rough set of 512 x 512 images of myself to train the model, and after spending some time playing with different settings for image generation, I noticed some patterns emerge. For example, a lot of the generated images had brightly coloured geometric backgrounds regardless of what I prompted. I'd included a selfie from an AR Effect I'd developed for the late HM Queen Elizabeth's Jubilee, which seemed to be skewing the model:
I spent some time assembling a higher-quality dataset and re-training the model, paying particular attention to:
The results were immediate - my hit rate of decent, interesting images went up.
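If you're prepping your own dataset, the mechanical part is easy to script. Here's a small sketch of how you might centre-crop and resize a folder of photos to 512 x 512 with Pillow - the folder names are placeholders:

# Centre-crop each photo to a square, then resize to 512 x 512 for training.
from pathlib import Path
from PIL import Image

src, dst = Path("raw_photos"), Path("training_set")  # placeholder paths
dst.mkdir(exist_ok=True)

for photo in src.glob("*.jpg"):
    img = Image.open(photo)
    side = min(img.size)
    left = (img.width - side) // 2
    top = (img.height - side) // 2
    img = img.crop((left, top, left + side, top + side)).resize((512, 512))
    img.save(dst / photo.name)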
Prompting
With more predictable results, I was able to get a better sense of prompting. Stable Diffusion includes super handy negative prompting, and as I explored I developed some notes:
I also developed a feel for the image generation settings and their influence over the result, specifically the Guidance Scale and Inference Steps. I'd noticed that at times the generated images seemed overly 'sharpened', and a super helpful demo on getimg.ai helped me work out that I was setting my Guidance Scale too high:
Somewhere between 7 and 8 seemed to be the sweet spot.
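Putting it together, here's roughly what a generation call looked like once I'd dialled things in - again sketched with the diffusers API rather than the notebook's exact interface, with a placeholder checkpoint path:

# Sketch of a generation call with the settings that mattered most for me.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "/content/dreambooth-output",  # placeholder path to the fine-tuned model
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="portrait photograph of danmoller, studio lighting, 80mm",
    negative_prompt="ugly, deformed, blurry, teeth, frame, artefact",
    guidance_scale=7.5,        # 7-8 was my sweet spot; higher looked over-sharpened
    num_inference_steps=50,    # more steps adds detail at the cost of speed
).images[0]
image.save("candidate.png")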
Settling on a result
I spent a long time really playing with this to get a feel for the process, while also discovering it can be addictive - like a poker machine, you want to keep pressing the button and seeing what kind of surprise you'll get. I pushed Stable Diffusion to give me a result that looked like a typical, high quality studio headshot with a slight quirky edge and not too serious. There were some artefacts that cropped up regularly with this method that I wanted to avoid:
In the end, I settled on this image:
For reference, my prompt was:
portrait photograph of danmoller, modern, jacquard bomber jacket, rainbow embroidery, backlight, realistic eyes, facing camera, high detail eyes, symmetry, color gradient background, professional photography, 8k, moody side lighting, friendly, happy, charming, nikon, 80mm
With a negative prompt:
ugly, deformed, hands, grumpy, angry, black and white, mutated, blurry, teeth, frame, artefact, glitch, error, street, side, suit, red, squashed
You'll notice that some of the keywords didn't land in the final result - e.g. "gradient background", and if you look carefully - you can see the irises are squashed, the lips have a strange wiggle and the hair has a bit of a 'game character' patterning to it. It's also 512 x 512 - so time to dive into some more AI tools!
Photoshop AI
I used Super Zoom in Photoshop's Neural Filters to upres the image, which gave me an easier canvas to work with to manually fix the eyes and lips. I then used the newly released Generative Fill to slightly outpaint and expand the canvas to give the image a little more headroom, tidy up the zippers and add a bit more detail to the hair:
After adding some vignette lighting to the background, film grain and lensing tricks I arrived at the final result:
Final thoughts
The debate is going to continue to rage between "this tech will steal all of our jobs" and "this tech is rocket fuel for productivity". After exploring 6 different AI tools throughout this project, I'm erring more on the productivity side. I could do this kind of thing before - but with these tools I can do it much faster and with higher quality results. Isn't this what happened when we moved from painting to photography? Yes, this work will take less and less labour, but if you keep yourself ahead of the curve isn't that a good thing? Isn't this what technology has always done?
Reducing the labour required to move from what we imagine to what we create is what we've always sought to do - not just in the creative industries but all industries.
Being able to dive into Bard to help me swiftly debug the code was a revelation - it's obvious that progress here will compound, and we should expect it to be exponential. We're building tools that will help us build more tools.
Then of course there are the ethical questions. Copyright, for one, is something for brains far smarter than mine to begin unpacking, and many of the tools used here are not licensed for commercial use. Next: should I be using a synthetic headshot at all? AI tools are increasingly making their way into cameras and consumer photo-editing apps - see Samsung's AI moon-enhancing controversy, or Google's Magic Editor. At the same time, my original photographer-taken headshots were retouched in the studio before they were even sent to me! We have to ask ourselves where we think the line is, and that will be subjective for each of us. Having spent my entire career in visual storytelling and image crafting, personally I'm fine with it.
Ultimately, this will come down to our own individual value judgements. What is a photo? At which point does a photo stop being a photo, and why? I'm super keen to hear your thoughts - let me know!