Exploring AI in Filmmaking: How I Elevated Visual Storytelling in a Music Video

Hi all, Brandon here!

Just finished working on a fun project: 'WUH', a music video for a friend and artist, ANDY. I blended post-production VFX apps with AI workflows on this project, and I'm excited to share how the mix not only made the creative process more fun but also improved turnaround times and visual exploration.

For those exploring AI in filmmaking, I hope my experience proves useful!


Themes

The video for 'WUH' is built around three core themes: Mind-bending, Soulful, and Enigmatic. Andy and I developed these after discussing the mood and atmosphere we wanted the video to convey:


Mind-bending scenes are meant to give off an unconventional, otherworldly feeling; these shots leaned heavily on visual effects.

Soulful scenes tapped into emotional, sentimental, and stirring visuals. These were shot entirely in camera, with little post-work beyond colour grading.

Enigmatic scenes lean into mysterious, esoteric vibes: not quite as otherworldly as Mind-bending, but not as sober as Soulful.

VFX & AI

I filmed 'WUH' using a Sony FX3 and an FE 24-70mm F2.8 GM II lens, relying on natural lighting and a reflector, with all scenes shot outdoors without a green screen.

In post-production, I used After Effects and Premiere for editing and compositing, and Cinema 4D for a shot requiring 3D assets. These are all industry-standard apps that were crucial for assembling, editing, and colour grading the video in Premiere, and integrating visual effects and 3D assets in After Effects.

For the AI workflow, I used Automatic 1111 (a web-based interface for Stable Diffusion), RunwayML (a suite of machine learning tools for video effects), and Photoshop's Generative Fill (think Content-Aware Fill, but powered by AI).

We're at a pivotal moment with AI and machine learning tools, particularly in how they intersect with the arts. It's a complex and evolving topic, and I definitely do not have all the answers. However, I do believe that artists risk missing out if they don't at least explore these tools and consider integrating them into their workflow at some level. This technology isn't just a trend; it's becoming embedded in every facet of our lives, and who better to help influence its course and ensure its thoughtful integration than those deeply passionate about the arts?

Blending the worlds of AI and traditional techniques for 'WUH' was a familiar yet intriguing journey. Every few years there's a new and potentially disruptive method of creating art. Creating assets with generative tools wasn't just a matter of pushing a button; I still had to know my craft deeply. From using the camera and equipment effectively, deciding on the final shots, selecting locations, and lighting Andy, to directing his performance (which was a breeze, thanks to his talent), every step demanded know-how.

In post-production, the skills involved were just as crucial - assembling and editing footage, rotoscoping, compositing multiple layers, and colour grading. In this context, AI tools were more like a cherry on top, enhancing rather than dominating the visual effects.

I leaned into the inconsistent aesthetic of animated generative imagery. The flickering and glitchy nature of current animated AI generations amplified the mind-bending quality of certain shots. It allowed for seamless transitions, like morphing a beach scene into an arid desert or expanding frame boundaries to smoothly switch from one shot to another. However, the core of this process still relied on clear intent and execution of skills, as no AI tool can replace an artist's ability to envision and bring an idea to life.


Tools and Techniques

The main AI tools I used were Photoshop's Generative Fill for scene extension; Automatic 1111/Stable Diffusion for training models and then applying those model styles to frames of the captured footage (via a process called 'img2img'); After Effects' Roto Brush for isolating Andy or other elements; and RunwayML for background removal. I used RunwayML at the beginning of post-production but transitioned to AE's Roto Brush because of the AI advancements made in Roto Brush 2 and 3.


Photoshop: Generative Fill

This is pretty new and a lot of fun to play around with. In short, think of Content-Aware Fill but backed by AI. I still find that Content-Aware Fill works wonderfully for colours, gradients, or less complex imagery.

Generative Fill was perfect for these shots:

Andy in Desert:

The original shot had Andy on a beach. It was a great shot, but there were already too many shots of him on this particular beach, and I wanted to add another location that wasn't a forest or a city. The locales in the video are meant to be varied and to conflict with each other, reflecting the themes in Andy's songwriting: specifically, the internal conflict of projecting a devil-may-care attitude on the outside while wrestling with that point of view on the inside.

Generative Fill was perfect for placing Andy in another locale. I first chose one frame from the scene, removed Andy from it in After Effects with the Roto Brush tool, and took that frame into Photoshop, where I used Generative Fill to create a clean plate and then ran it again to change the locale from a beach to a desert. I then expanded the image so I could zoom in and out if needed and have extra coverage. From there, I took the new background frame back into After Effects and placed the rotoscoped Andy into the new location. I tracked the camera movement of the original footage, made everything a 3D layer, imported the camera, did some light colour correction, and then Andy was in a desert!
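Conceptually, the final step is just an alpha composite: the rotoscoped foreground laid over the generated clean plate. Here is a minimal Python sketch of that idea using Pillow; the file names are hypothetical stand-ins for the layers exported from After Effects and Photoshop (in the actual shot this compositing was done in After Effects with a tracked 3D camera).

```python
from PIL import Image

# Hypothetical exports: a rotoscoped Andy frame with an alpha channel,
# and the desert clean plate produced with Generative Fill.
foreground = Image.open("andy_roto_0001.png").convert("RGBA")
background = Image.open("desert_clean_plate.png").convert("RGBA").resize(foreground.size)

# Lay the cut-out foreground over the new background.
composite = Image.alpha_composite(background, foreground)
composite.save("andy_in_desert_0001.png")
```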

Generative Fill


Andy from parking lot to forest transition:

I wanted to find a way to link two scenes together via a transition from the city to the forest. The parking lot and forest shots were both 3840 x 2160, but I needed them to be taller so I could transition between the two scenes seamlessly.

Extremely resized version of the clean plate I used for the transition.

Using Generative Fill here allowed me to expand beyond the bounds of the scenes. Much like the desert scene, I removed Andy from the parking lot shot, then used Photoshop to create a clean plate without him in it. From there, I expanded the clean plate vertically using Generative Fill. For the forest, I did the same thing: created the clean plate and then extended it vertically. Once I had those two to work with, it was a matter of taking the clean plates back into After Effects, where I blended the two shots using a 3D camera and some lens blur to achieve the effect.
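For anyone curious what "expanding beyond the bounds" means in practice, the prep step amounts to padding the frame onto a taller canvas and asking Generative Fill to paint in the empty band (outpainting). A minimal Pillow sketch of that prep, using hypothetical file names and an assumed target height (in Photoshop itself this is just a Canvas Size change before running the fill):

```python
from PIL import Image

# Hypothetical clean plate at the original UHD resolution.
plate = Image.open("parking_lot_clean_plate.png")    # 3840 x 2160
target_height = 3240                                 # extra vertical room (assumed value)

# Pad the frame onto a taller canvas; the blank band at the top is what
# Generative Fill (or any outpainting tool) is later asked to fill in.
canvas = Image.new("RGB", (plate.width, target_height), (0, 0, 0))
canvas.paste(plate, (0, target_height - plate.height))
canvas.save("parking_lot_extended_canvas.png")
```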


Automatic 1111/Stable Diffusion

This tool has been a blast to play around with. There's a lot you can do in this space, but I primarily used it to create models based on Andy's likeness and then applied those models to image sequences of the shots I needed, using a feature called img2img. Img2img lets you take an image and tell the AI to change it according to your text instructions. It's like asking an artist to take your picture and redraw it with some changes you want, like making the sky more dramatic or changing the colours of your clothes. The AI uses its understanding of how images are put together (a process known as diffusion) to make these changes.
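To make the img2img idea concrete, here is a minimal Python sketch using the Hugging Face diffusers library rather than the Automatic 1111 interface itself. The checkpoint name, prompt, and file names are placeholder assumptions; a custom model trained on Andy's likeness would be loaded in place of the base checkpoint.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

# Load a base Stable Diffusion checkpoint (a custom model trained on the
# subject's likeness would be loaded here instead).
pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# A single frame exported from the footage (hypothetical file name).
init_frame = Image.open("frame_0001.png").convert("RGB").resize((768, 432))

# 'strength' controls how far the output drifts from the source frame:
# low values keep the original composition, high values restyle heavily.
result = pipe(
    prompt="portrait of a singer, dramatic sky, glitch art style",
    image=init_frame,
    strength=0.45,
    guidance_scale=7.5,
).images[0]

result.save("frame_0001_stylized.png")
```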

An early test in Automatic 1111: The idea here was to see how a model trained on my likeness could be used to replace the face of a Metahuman created in Unreal Engine. Had trouble with consistency beyond a few frames, so this method was scrapped.

Another early test, this time using a model trained on Andy on a Metahuman from Unreal Engine.

I decided to use the img2img output in a way that would once again tie back to the themes of the song: the inconsistent way we say we're being ourselves when, in reality, we're not. Raw output from batch img2img (an animated image sequence) can be inconsistent and often produces conflicting styles across a run of frames. I decided to lean into that inconsistency and have different exaggerated versions of Andy fighting for the spotlight and control of the scene - they show up in really glitchy, inconsistent, mind-bending ways.
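For reference, batch img2img over a frame sequence can also be driven from a script. The sketch below assumes the Automatic 1111 webui is running locally with its --api flag enabled, and uses hypothetical folder names and a placeholder prompt; even with a fixed seed, frame-to-frame results tend to flicker, which is exactly the quality being leaned into here.

```python
import base64, io
from pathlib import Path

import requests
from PIL import Image

# Assumes the Automatic 1111 webui is running locally with --api enabled.
API_URL = "http://127.0.0.1:7860/sdapi/v1/img2img"

frames_in = sorted(Path("frames_in").glob("*.png"))   # exported image sequence
frames_out = Path("frames_out")
frames_out.mkdir(exist_ok=True)

for frame in frames_in:
    payload = {
        "init_images": [base64.b64encode(frame.read_bytes()).decode("utf-8")],
        "prompt": "exaggerated portrait of a singer, surreal lighting",  # placeholder prompt
        "denoising_strength": 0.4,   # lower = closer to the source frame
        "steps": 25,
        "seed": 1234,                # fixed seed; frames will still flicker
    }
    response = requests.post(API_URL, json=payload, timeout=600)
    response.raise_for_status()
    stylized_b64 = response.json()["images"][0]
    stylized = Image.open(io.BytesIO(base64.b64decode(stylized_b64)))
    stylized.save(frames_out / frame.name)
```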


I also used Generative Fill to explore different backgrounds.

This was a mix of Stable Diffusion on Andy and Generative Fill on backgrounds.

Leaning into the glitchy, inconsistent effect on Andy composited into the shot.

Challenges

Using these tools wasn't all easy. In some cases where the results from Stable Diffusion just didn't have the fidelity I wanted, I found myself opting for tried-and-true techniques.

One particular shot called for several giant Andy heads in the background. Originally, I was going to do this with Stable Diffusion, but the results just looked really off. I felt they called too much attention to the effect versus Andy’s performance and style, so I opted to do it all in Cinema 4D and After Effects (if you'd like a breakdown of this one, let me know in the comments).


Early tests; this was just too far off from the intended look.

More early tests; here, too, it felt like using Stable Diffusion did nothing for the scene.

The C4D + AE combo was the way to go for the shot in this case.

You also need a beast of a machine to play around in this space. My specs: a GeForce RTX 3080 with 10 GB of VRAM, a Ryzen 9 5950X 16-core processor, and 98 GB of RAM - and I still had trouble setting things up and getting them to run stable (no pun intended) consistently. But the challenge of staying on top of what was new and learning how to navigate this space kept me going.


Conclusion

WUH is the first music video I've created for someone else in a while, and I wanted to deliver something exciting to watch that also invites the viewer to learn more about Andy and his music. It was a ton of fun planning everything out, picking up my camera, shooting, editing, and making this video come to life.

It also taught me that no matter how much we advance and create new software to make things easier, you still need a point of view and enough interest and love in the craft to see it all through to the end. I admit it is somewhat overwhelming how fast all of this tech is moving, but I feel that we can and should understand this space and use it in tandem with our current talents to make things we've previously only been able to dream about. I would love to hear ways you think this tech could be used to advance the arts - how would you use it in a project?


Click here for links to Andy's socials and music.

Here's where you can find me on socials. I'll be doing more deep dives there - reach out and say hi:

Instagram

Threads


Here are some resources, guides, and inspiration that helped get me up and running with Stable Diffusion:

Olivio Sarikas

Aitrepreneur

MDMZ

enigmatic_e

Karen X. Cheng

Thanks, y'all, looking forward to talking about the next project!
