You May Hate ChatGPT, But You'll Love DALL-E
Joe Lazer (Lazauskas)
Best-Selling Author of The Storytelling Edge, Fractional Head of Content @ A.Team
When it comes to visual art, I've got no game.
In kindergarten, I was constantly scolded for failing to color inside the lines. Confidence-wise, it was all downhill from there. Sophomore year of high school, I got a C+ in photography, a slam-dunk elective that was supposed to compensate for the fact that I quiet-quit calculus. More recently, my chicken-scratch content strategy diagrams have been traumatic enough to warrant a class-action lawsuit from the designers I work with. I can write a book, but I can’t draw a straight line.
That makes me the target audience for DALL-E 2 and Midjourney, the two most popular AI image generators, which promise to let you MAKE VISUAL ART WITH WORDS. If you read this newsletter, there’s a good chance that you are too.
If you haven’t tried DALL-E or Midjourney yet, it’s understandable. Our feeds are flooded by the endless lists of tools peddled by AI bros who claim to be “award-winning 10x growth experts” but don't seem to have an actual job. It’s easy to think the whole thing is crypto 2.0, an opportunistic scam hyped by the worst people you went to high school with.
And listen: you should be skeptical. As I wrote two weeks ago, ChatGPT lies and hallucinates. The new Bing, which attempts to fuse ChatGPT with traditional search, has become an unhinged sci-fi character. You shouldn’t use AI as a replacement for search or human writers.
But you also shouldn't dismiss AI tools. They’re really good at things that don’t require total factual accuracy. They’re amazing at creating images, brainstorming ideas, and rewriting copy, plus automating a whole host of boring tasks. They’re inevitably going to change our jobs as marketers and creators the same way search and social did, and if used well, they have the potential to turn us into super creators.
Think of these next few newsletters as a guide to making the most of AI tools, for skeptics, from a skeptic. A thoughtful breakdown of specific use cases, instead of a never-ending list.
First up: digital image generation, using tools like DALL-E and Midjourney.
Midjourney is like some sketchy spell Hermione cooked up to save Ron from failing art class … or to save you from lame hero images
Being a writer who can’t design sucks. You can write the most brilliant piece, but then you’re stuck searching stock photo sites, hating yourself, for an image that 50,000 people have inevitably used before. You’ll feel no greater disdain than when you press publish on this bad boy:
You know you’ve used it.
DALL-E and Midjourney are the two most popular image-generating AIs. DALL-E comes from OpenAI, the company behind ChatGPT. Midjourney is an independent research lab whose tool runs through a Discord server. They’re both free to start using and basically do the same thing (make amazing images from text), each with its own quirks.
For instance, Midjourney is, for whatever reason, way better at Harry Potter content. So I asked it to make an image of the H2 above (“photo realistic Harry Potter. Hermione casting a spell on Ron”) to really illustrate my point.
To use them well, you need to master “prompt engineering,” which sounds lame but is really important
We’ve been using Midjourney and DALL-E to create original images at A.Team for the past six months, and it’s been a huge help for our blog and social channels.
The key to using Midjourney and DALL-E well is “prompt engineering.” If you’re rolling your eyes and gagging after reading that term, I get it. It sounds like BS. Really, you’re not engineering anything. You’re just bossing the machine around with the swagger of Miranda Priestly.
These tools thrive on specificity. Say you’re working for Miranda’s favorite designer and generating some fashion inspiration. Ask DALL-E for an image of a “members-only jacket” and you get this, which is simultaneously bland and unnerving.
But give DALL-E a more specific prompt of “Jackson Pollock. Members only jacket. Converse sneakers. product photography” and you get this:
Way more dope. Miranda will be pleased.
Not surprisingly, our creative director Brad is way better at using DALL-E and Midjourney than me. That’s because he knows how to creative-direct the AI. Meanwhile, I’m throwing around terms like “low-angle, telephoto lens, blue hour” like I know what I’m talking about when, again, I nearly failed high school photography.
Again, the key to using these tools well is specificity. Say I need to generate a still for the poster mock-up of the new Air Bud: Crypto Canine script I’m working on, a heart-wrenching film in which Air Bud retires from basketball to launch a crypto Ponzi scheme, develops a microdosing regimen, and flies too close to the sun.
Ask DALL-E for a “crypto golden retriever” and you get this: a good boy, but not what we’re going for.
But ask for “Film still. Gangster golden retriever in sunglasses wearing a gold chain, sitting on a pile of cash” and you get this:
Not perfect, but definitely more of the Air Bud meets SBF vibe I was going for.
The best resources I’ve found are Guy Parsons’ excellent DALL-E 2 prompt guide and this Midjourney guide from Lars Nielsen. Their advice basically boils down to being specific and iterating. That means using prompt language like:
Or as our creative director Brad advised: “Be very specific with elements you can control. For example you could give it a general color palette, or hex code, or define color for every element: Background, skin tone, etc.”
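If you like seeing things spelled out, Brad’s advice can be sketched in a few lines of Python. Treat this as a rough illustration, not anything official: the `build_prompt` helper, its parameters, and the hex code are all made up for this example, and neither DALL-E nor Midjourney promises to follow a hex code exactly. All it does is glue the specific pieces (style, subject, extra details, a color for each element) into one period-separated prompt you can paste into either tool.

```python
# Hypothetical helper: none of these names come from DALL-E or Midjourney.
def build_prompt(subject, style=None, details=(), palette=None):
    """Join the pieces of an image prompt into one period-separated string."""
    parts = []
    if style:
        parts.append(style)
    parts.append(subject)
    parts.extend(details)
    if palette:
        # Spell out a color for each element you want to control.
        colors = ", ".join(f"{element} {color}" for element, color in palette.items())
        parts.append(f"Color palette: {colors}")
    # Drop stray whitespace and trailing periods so the pieces join cleanly.
    cleaned = [p.strip().rstrip(".") for p in parts if p and p.strip()]
    return ". ".join(cleaned)


# The vague prompt from the Air Bud example.
vague = build_prompt("crypto golden retriever")

# The specific prompt, plus an assumed palette (the hex code is invented).
specific = build_prompt(
    "Gangster golden retriever in sunglasses wearing a gold chain, sitting on a pile of cash",
    style="Film still",
    palette={"background": "#0B1D3A", "chain": "gold"},
)

print(vague)
print(specific)
```

The first line printed is the vague prompt; the second is the film-still prompt with the palette tacked on the end. Swap in your own elements, run it again, and keep iterating.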
Really, it’s a lot of trial and error. Prompt engineering takes work: the output is only as good as your creativity and ability to give direction. That’s a nuance a lot of people miss when they talk about these tools replacing creatives instead of supplementing them.
Ultimately, the process is super fun and way better than using a stock photo ever again. Or writing a newsletter for two years and getting 130K subscribers while using a lo-res screenshot of your book as the logo. But thanks to Midjourney, I was finally able to make one that’s ridiculously over the top:
What are they bad at?
Both tools really struggle with text. Ask either one for any sort of text and it’ll spit out random characters that always seem to be stylized like Russian propaganda.
Realistic faces are often tough and require several iterations. Midjourney can work well off an existing photo, which is super cool and also a privacy nightmare.
They also struggle with realistic props like a stack of cash (see the Air Bud pic above), although Midjourney seems to handle them a bit better.
If you’re still reading at this point, I’ve likely worn you down. So which one should you start with?
They’re both free, so I’d recommend both. Start with DALL-E because it’s easier to use, but also head to Midjourney because you’ll learn a lot more.
You just go to the DALL-E 2 site, log in to OpenAI using your Gmail account, and then boom, you’re up and running. Use Guy Parsons’ guide and start writing some prompts. You get 50 free credits to start, and then 15 free credits every month after that. You can pay for more, and it’s not nearly as expensive as it should be.
Midjourney is slightly more confusing, at least if you’re a lame AF mid-millennial like me who never uses Discord. You need to create a Discord account, and then you give prompts to the Midjourney bot in a public Slack-style channel.
The downside is that it’s sometimes hard to keep track of your requests; the upside is that you get to see what everyone else is creating. This helps you improve your prompts, and there are also channels where people share real tips. You also get the buzz of being in a weird, exciting community. As I write this, someone in my channel is creating pictures of Mr. Clean with a mullet:
Someone else is clearly creating a children’s book, one image at a time.
And a dad is turning his son into a superhero … a heartwarming prompt that inspired me to do the same with a picture of my dog Quincy.
And if a gritty still of Quincy from The Dark Dog Rises doesn’t get you excited about this technology … I don’t know what will. Although Midjourney does need to learn that Quincy has four legs.
Chief Creative Officer at SilverShore Studios
2 months ago: Is Dali an AI tool or App?
Marketing Lead | Content Manager | Copywriter | Editor | Consultant
1 year ago: DALLE-2 has impressed me a lot more than the textual equivalents. I like what you say about a lot depending on prompt engineering. I wrote a piece here on using DALLE-2 as a starting point for book cover design ideas. https://www.nownovel.com/blog/book-cover-art-ideas/. It's not going to do better than a human book cover designer with understanding of layout, typography, genre trends and norms, etc. It's limited in its square format. Although it was sometimes in the mood for romance (and gave succulent-looking ostriches on couches) it missed the mark a lot too. But it understood branding trends such as 'mystery/suspense book covers = black and blue with fog'. I liked the inspirations it gave for a hypothetical edition of Great Expectations.
Editor @ Living Light Corporeal LLC | Universal EPA Certification
1 year ago: My mind is expanding with all of the tools available that are not dependent upon my limited artistic abilities. But as for imagination - that is something being ignited almost daily - I try to avoid the use of anyone else’s style when creating my prompts:
Dot connector. Storyteller. Arts Advocate.
1 year ago: Interesting article, thanks. I find myself on the fence re both AI artwork and chat. As a writer, I struggle with the notion that ChatGPT will be used to spit out generic stories/articles removing the human/heart elements of writing. I also struggle with seeing professional artist colleagues who have honed their crafts for years watching non-artists "create" prompt-driven artwork to be sold as fine art. While I see the merits for background art and editorial embellishments (when the STORY is the product,) and those who are choosing to explore their inner creative for fun, I know I'm not the only person who struggles with the ethical dilemma of the AI-as-fine-art notion. Would love to hear your thoughts on this.
--
1 year ago: Great article! Love the gangsta Golden!