AI Video: Generative Tools Guide
Header image: generated in Midjourney, then modified in LinkedIn using Microsoft Designer's generator.

Comparison Guide & Analysis for Worldbuilding, Design, Production & XR Spatial Immersives (2024 Q1 edit)

The generative video landscape is booming thanks to big news and releases from OpenAI, Google and the others covered below. There are over a dozen affordable tools that anyone can use to make their own videos at home, for bigger screens or social sharing. Most of these tools are being deployed for shorts, music videos, advertising and, in some cases, cinematic releases. These tools are also being used for deepfakes, so please be wary and only use them with the consent of the people you are working with (do not use other people's likenesses or footage without their express consent when you are testing for yourself).

This is a Q1 2024 #AITools comparison guide that will be updated as new tools are made available to the public. Safety and risk notes are also added below.

Please note - this is a top tools list and not an exhaustive list of every generative video tool or suite available across language sets and models. I will update it as I have an opportunity to test and try more tools coming out of beta. Please leave a comment or drop me a private message with links if you'd like me to beta test your upcoming releases for video and media creators. I teach and test these tools with my band; some of our experiments are at www.youtube.com/@AuriclesAI.

Kaiber: A quick go-to solution for abstract beauty

Benefits: Makes longer audio-reactive videos from an image or prompt, or transforms existing video into long clips with multiple scenes and moving shots. It has 2 generators and versions for making abstract or artistic videos (our @AuriclesAI music video was double processed through Kaiber, then Topaz).

Drawbacks: Less realistic than upcoming models, which achieve realism through other means; requires credits, which can be costly for complex productions where iteration or human realism is needed to nail specific shots and scenes.

Runway Gen2 - Runway's latest version for realism and motion

Benefits: Multiple generators and versions for transforming existing video to a new style, or for generating clips from a text prompt or image, with some degree of realistic movement through motion brush and painting in motion areas (Gen2 features).

Here's an example of video I made in Runway using motion brush in Gen2:

Drawbacks: Can be expensive for added credits when producing complex works with many scenes to match other media content in a published work; expect to spend at least $40-100 per month if making more than a few seconds of video.

SORA - Coming soon from OpenAI, diffusion realism

I'm reserving judgment on both benefits and drawbacks until it is public. SORA promises to be closer to a worldbuilding toolkit with video publishing capacity, if what the early previews show persists in the released product.

Premiere & Adobe Creative Suite, already used in your workplace

Benefits: Built into tools that many creators already use, especially at work. The creative tools for generative fill are quite good and able to anticipate a number of style and size/scale challenges other generative tools are unable to meet.

Drawbacks: Paying for Creative Suite can be pricey and a full account with payment on record is required for any significant use of their generative tools.

D-ID: the lip syncers' & deepfakers' early solution for ease of use

Benefits: Makes lip sync clips much easier to produce than most existing solutions and is built into other tools like Canva (see below). It was one of the first, but isn't necessarily the best current solution for matching lip movements.

Drawbacks: The lip sync may not always match the words you'd expect in that language and dialect. Non-human characters are allowed up to a point but rarely look realistic enough for most uses. More realistic tools are now available, as are more accessible or free ones.

Pika Labs: a great go-to choice to get started with generation

Benefits: Has new features like lip sync and sound effects for pro users; some of those features work better than others. Realism is moderate in my tests. I found the Barbies on the Red Carpet most compelling, while the first scene was nothing close to my prompt. Output that does not match the prompt is a common issue in most generators and is not specific to this tool; realistic and concrete ideas are easier to generate from a photo than directly from a prompt.

Drawbacks: Discord was required for generation early on. It can now be prompted via the web, but not all features are available unless you are a pro user paying $58 a month for all of their professional services.

Augie (AugXLabs) - combines multimodal writing, VO, clips & edit

Benefits: Makes it relatively simple to make a basic generated video with licensed clips that are already cleared for social media use. Good for basic storytelling, OK for teachers who want to add to their resources and likely OK for those who have a rental or sales offering and need to make videos quickly. Uses GIFs and video from verified licensed sources like Getty Images.

Drawbacks: Editing can be a bit cumbersome, and it takes time to sort out how to modify videos effectively before publishing. It needs additional UX work and, hopefully, cross-sector integration with partners in the future. Voiceover is more subtle in other programs, but here it is sourced through ElevenLabs, which offers professional services to those who want to make voice clones for their own audiobooks.

Wonder Dynamics: a VFX go-to tool for the indie filmmaker

Benefits: Great for making non-human characters mapped to human movement, for videos that feature dancing characters or other human-like comedic effects. Note - I'm not as experienced with this tool, so I have included a video from an experienced user here.

Drawbacks: Not a typical video generator; this is more of a VFX toolkit for a specific type of replacement shot where a generative character replaces a human in a scene. Useful for some types of film and short video releases. Expensive for larger film projects, albeit less expensive than other VFX workflows. "We are paying Wonder Dynamics to learn how to use its platform" - Brenda Blanco, in the video above, reviewing the cost of processing and promptcraft.

Pictory: the quick and easy production choice for many companies

Benefits: A top-to-bottom video production solution, similar to those within Augie, Canva and Adobe (Firefly), that combines modalities for quick video production using a mix of partnered generative tools and solutions. More of a ratchet set in your toolbox than a specific tool for very specific jobs, it can get a few things off your list quickly.

Drawbacks: Alternate editing solutions, including CapCut AI, Canva and Augie, may offer more realistic or lifelike results for those who need realism or more control over their content in the editing process.

Fulljourney

Benefits: Relatively easy and free to use to get started, but not great quality

Drawbacks: Discord is required for video generation. It is not useful for lip sync yet and has limited capacity for realism at this stage of its development.

Lumiere (Google's upcoming release)

Not yet in public release, this toolkit promises greater realism than can be currently achieved in most of the tools above other than SORA; lighting may be the leading edge here as the name suggests along with the quality of the team and source material available for training. However, Google has not always had great luck debuting new generative tools for the public (see Gemini). Lumiere has great promise from early reports but is not yet open to the rest of us for testing.

NVIDIA AI: Picasso, ACE, SD Video + others

NVIDIA has a suite of solutions for professionals, including both open source solutions and higher-cost cloud and server support for film and television effects. These range from historic recreation to relighting, avatar replacement and restyled video clips, for cinematic-quality blending within professional workflows.

https://catalog.ngc.nvidia.com/orgs/nvidia/collections/nvidiaai

https://www.nvidia.com/en-us/gpu-cloud/picasso/

Maxine Live Portrait: https://catalog.ngc.nvidia.com/orgs/nvidia/teams/ai-foundation/models/live-portrait

ComfyUI/AnimateDiff (Stable Video Diffusion)

Benefits: Can be installed locally and runs a mostly open source (OS) workflow

Drawbacks: Output quality; requires advanced skills for best use; not for beginners or non-technical creatives

Canva (with Runway & D-ID inside)

Benefits: Easy to edit & integrate into videos and reels and to publish to social streams. It takes a lot of time out of production for simple reels, short videos and publishing needs that include multimedia and multimodal prompting. Generative video inside Canva is not great (Runway enabled) but is better than blank backgrounds. Magic Studio is somewhat useful, the D-ID integration works somewhat better than video generation, and text generation is quite strong.

Drawbacks: Requires a monthly paid subscription to integrate most assets into published works, starting around $15/month for access to apps and release tools.

EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

https://humanaigc.github.io/emote-portrait-alive/

This tool from China is one I'm currently tracking. It looks promising for lip sync in a variety of use cases and settings. Whether it matches what we're getting out of SORA, NVIDIA or Lumiere remains to be seen.

Producer's note for post: I almost always use the video tools from Topaz Labs on my generative videos for cleanup, and sometimes to upscale or improve the frame rate, as many of these tools only generate 8 to 12 frames per second. A low frame rate can add to artistic effects if you're experimental and willing to take time to learn those tools. Budget the $$$ for post production and mastering tools using AI; they're worth it.
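To put the frame rate point in numbers, here is a rough sketch of how many frames a post tool would need to synthesize to bring an 8 or 12 frames per second clip up to a smoother rate. The 24 frames per second target, the 4 second clip length and the function name `interpolation_plan` are my own assumptions for illustration; this is plain arithmetic and not how Topaz or any other tool works internally.

```python
from fractions import Fraction


def interpolation_plan(source_fps: int, target_fps: int, clip_seconds: int) -> dict:
    """Estimate the frames that must be synthesized to reach target_fps.

    Hypothetical helper for illustration only; it does simple frame-count
    arithmetic and does not call any real interpolation tool.
    """
    if source_fps <= 0 or clip_seconds <= 0 or target_fps < source_fps:
        raise ValueError("expected 0 < source_fps <= target_fps and clip_seconds > 0")
    source_frames = source_fps * clip_seconds
    target_frames = target_fps * clip_seconds
    return {
        # How many output frames are needed per generated frame
        "factor": Fraction(target_fps, source_fps),
        "source_frames": source_frames,
        "target_frames": target_frames,
        # Frames the post tool has to invent between the generated ones
        "new_frames": target_frames - source_frames,
    }


if __name__ == "__main__":
    # 8 and 12 fps are the generator rates mentioned in the note above;
    # the 24 fps target and 4 second clip are assumed values.
    for fps in (8, 12):
        plan = interpolation_plan(fps, target_fps=24, clip_seconds=4)
        print(
            f"{fps} fps -> 24 fps: {plan['factor']}x, "
            f"{plan['new_frames']} of {plan['target_frames']} frames synthesized"
        )
```

Under those assumptions, a 4 second clip generated at 8 frames per second has 32 frames and needs 96, so two out of every three frames in the final clip are synthesized in post. That is one reason the cleanup budget matters.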


That's a LOT of tools. So which generative video tool is the best?

For realism, keep an eye out for SORA and Lumiere's next releases. Pika is a great place to be making video clips today, along with Runway Gen2 using motion brush.

Runway works with NVIDIA so if you're a video production professional, watch the GTC talks next week about their new AI releases. NVIDIA and Adobe tools are fantastic overall and may already be used by your company on other projects.

Other tools in beta or now available to the public (see above) are great for editing, VFX, abstract art and style transfers. I personally use Kaiber and Runway more than any other tools for art and concepts. I send teachers and parents to Pika, Augie or Pictory if they're not tech savvy or not already users of Adobe products. I'm using Canva on the regular for production but find it uneven for generative output.

A Note on Safety and Risk related to AI and Video

Risk and safety remain concerns not only within the generative media space but more broadly with the public and with governments. Deep fakes being used for public manipulation are a common concern along with a rise in hacking and cyberattacks. Teachers are constantly concerned about use in classrooms.

Note that the EU has released its AI regulations today, awaiting a final vote:

"One big “win” for civil society was the Fundamental Rights Impact Assessments (FRIAs) – there will be an obligation for high-risk AI deployers to conduct these assessments. But – and it’s a big ‘but’ – the FRIAs do not always include the private sector, so only those deploying AI in the public sector and a narrow subset of private companies will have to assess the risk to human rights – leaving many people unprotected." - Laura Lazaro Cabrera

How we choose to build this industry through thoughtful feedback is up to all of us. What tools interest you, and what are the biggest questions you have regarding their use and safe deployment with the public?


Roger Rangel

Generative AI Expert | Creative Technologist | Software Engineer | Founder @Vibeverse.

8 months ago

Amazing, as the CEO of a startup that connects influencers with AI, this guide is priceless. Very interesting.

David Cronshaw

Sr. Product Manager @Disney Streaming | Co-Founder Chatmosa chatmosa.bsky.social | AI, Generative AI | Revenue Generation | Former Microsoft and T-Mobile | Co-Founder UltimateTV.com - Zap2it.com

8 months ago

Evo, very informative! Thank you.

David Eliason

VR/AR/XR Artist

8 months ago

Excellent rundown, Evo! Thanks!

Evo H.

Media/Technology Strategy & Workflows Producer, Showrunner, Artist & Author Immersive, Interactive, Spatial & Generative

8 months ago

Next up - 3D world builders!
