TCB Sidebar: Two AI tools that go beyond the prompt box
Yes, TCB is a biweekly newsletter. But now and then, I’ll do a “sidebar edition” that focuses on a particular topic. As always, let me know if you find this useful and if you have specific topics or trends you’re curious about.
In this sidebar edition, I’ll look at two generative AI imaging tools that have a common denominator: instead of starting with a blank slate and a prompt box, you provide some raw material—a snippet of video or an image—to guide the AI’s efforts.
To be clear, this approach isn’t unique to the two tools I’ll spotlight here. Most gen AI imaging tools offer at least some degree of “start with this image and go from there” functionality. But the tools discussed here put interesting twists on the concept.
Gen-1: Video styling
Runway is an up-and-coming cloud-based video platform with an impressive lineup: collaborative in-browser video editing and a dizzying array of AI-driven features.
But no Runway feature has gotten more attention than Gen-1, which lets you upload video and then modify it in some amazing ways. Turn a running dog into a tiger, or a person into an alien...or just give them a weird look. Apply styles such as watercolor, sketch, or claymation, or make a video look like it was shot in a snowstorm. Take an untextured 3D animation, colorize it, and add an environment.
Gen-1 has been in a closed beta program for a while, but as of this week, it’s available as a beta open to anyone with a free Runway account. Clip duration is limited to five seconds, and after creating a dozen or so clips, you’ll need to upgrade to a paid plan (which starts at $12 per month if paid annually).
My take: Getting good results in the beta version of Gen-1 requires a fair amount of experimentation and tweaking. (And for the record, I don’t consider my examples above good.) Still, Gen-1 is addictive and intriguing, and it points to a future where AI-generated video is just another part of a producer’s toolbox. Even in its current form, Gen-1 is worth trying for video projects that can benefit from its quirky, edgy look.
What’s next: Runway’s Gen-2, in private testing now, promises to take AI video to the next level, offering similar stylizing features while adding new capabilities: text to video (“a carousel at an amusement park”), image to video (putting the people in a photo into motion, for example), and more. Runway’s explainer, well, explains it.
ControlNet: Guiding Stable Diffusion
Stable Diffusion is to AI imaging tools what Linux is to operating systems: geeky, open source, and infinitely extensible. And unlike DALL-E, Midjourney, or Adobe Firefly, you can run Stable Diffusion locally on a sufficiently powerful PC, enabling much faster renders with no subscription fees.
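If you’re the hands-on type, here’s roughly what a local render looks like using Hugging Face’s open source diffusers library. This is a minimal sketch, not an official Stability AI workflow; the model name and settings are just common examples.

    import torch
    from diffusers import StableDiffusionPipeline

    # Download the weights once; after that, renders are local and free.
    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")  # assumes an NVIDIA GPU with enough VRAM

    image = pipe("a watercolor painting of a lighthouse at dusk").images[0]
    image.save("lighthouse.png")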
Stable Diffusion’s open source nature has led to an explosion of add-ons: tools that create video, generate images in very specific styles, provide sophisticated inpainting, and much more. (Replicate.com is a great way to sample and run them.)
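For example, sampling one of those add-ons on Replicate takes just a few lines of Python with its client library. A sketch, with assumptions: the model identifier below is a placeholder (browse Replicate for real ones), and you’ll need a REPLICATE_API_TOKEN environment variable set.

    import replicate  # pip install replicate

    # Placeholder model ID -- substitute a real add-on from replicate.com.
    output = replicate.run(
        "some-creator/some-stable-diffusion-addon",
        input={"prompt": "a lighthouse at dusk, watercolor style"},
    )
    print(output)  # typically a URL (or list of URLs) to generated images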
One particularly powerful add-on is ControlNet. It lets you steer Stable Diffusion by supplying a reference image for the AI to follow. Want a specific composition, shape, or color palette? Create a quick sketch, then write your text prompt.
While playing with ControlNet, I tried an experiment. I uploaded a photo of a paper cutout that I created waaay back in first grade. My accompanying text prompt: a fanciful, colorful bird in a jungle, 3D render, unreal engine, computer illustration.
Here’s my first-grade artwork and Stable Diffusion’s result.
Instead of letting Stable Diffusion use its own pseudo-imagination to design the shape of my fanciful bird, ControlNet stepped in and used my source image as a guide.
I wonder what my first-grade self would have thought.
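For the technically curious, here’s approximately how that experiment translates to code, using the diffusers library’s ControlNet support. Again, a sketch: I’m assuming the scribble-conditioned ControlNet model, and the file names are made up.

    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
    from diffusers.utils import load_image

    # A ControlNet trained on scribbles steers composition from a rough sketch.
    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16
    )
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5",
        controlnet=controlnet,
        torch_dtype=torch.float16,
    ).to("cuda")

    guide = load_image("first_grade_bird_cutout.png")  # the source image
    result = pipe(
        "a fanciful, colorful bird in a jungle, 3D render, unreal engine, "
        "computer illustration",
        image=guide,
    ).images[0]
    result.save("fanciful_bird.png")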
Why it matters: Features like these are the AI equivalent of handing a rough sketch to an artist or designer. As such, they’re critical for creatives who want real control over the output of an AI imaging tool.
The trend: beyond the prompt box
As I noted up front, the trend here is that we’re rapidly moving beyond the simple prompt box. Prompts still matter, but AI imaging tools are gaining features that enable creatives to more closely collaborate with an AI.
Rather than telling an AI, “Here’s a prompt; let’s see what you come up with,” we can say, “I want the results to be like this; here’s the prompt that describes the details.”
More control, more collaboration, more creative options. What’s not to like?
I’m a senior content manager for the Creative library at LinkedIn Learning, responsible for planning courses on graphic design, video production, art and illustration, and related topics. The Creative Brief is my biweekly(ish) newsletter spotlighting news, trends, and data points of interest to creative professionals. Please consider sharing it, and thanks for reading!