The Un-Guide to Midjourney
Magnus Dahl
Communicator at The Swedish Institute for Social Research. Brain for hire.
This is not a guide to Midjourney or any other generative AI tool. There are hundreds of tutorials and how-tos available, just a Google away. Go look at them if you want to learn about parameters, commands, and such.
No, this is just me, Magnus Dahl, trying to understand my creative process working with Midjourney. I am struggling to articulate my thoughts on "prompting" (a horrible word) and "prompt engineering" (an awful expression), and I often think best when I'm writing.
Let's start with the horrible words. A "prompt" is a text written by a human, given to an AI in the hope that it will return the desired output. When I, the human, write "Painting of a hortensia, American modernism" in the Midjourney input field, I hope the AI will give me an image that looks like a painting of a hortensia.
Maybe something like this:
Or perhaps like this?
Midjourney is a random image generator. A random image generator that you can steer in the direction you want, but still a random image generator.
A prompt is a wish disguised as a computer system command. It is a manifestation of human intent, offered up to a machine. The word "prompt" gives a false sense of control. The expression "prompt engineering" is even more devious, as it hints that generative AI use is a science. Something you can control with mechanical precision. It is not.
"Prompt engineering" is an expression of the human desire for control. We have created a machine that can do amazing things, so we must control it. The goal of prompt engineering is to reduce the amount of chaos in the AI output and make it predictable. But the methods of control we have today are based on hearsay, rumors, and sales speech. No one, not even the creators of the tools, fully knows which words, phrases, and strategies will actually work. Control is an illusion.
And it does not matter. Because creativity, even machine-aided, is not about control. It is about empathy and dialogue. It is about giving and taking and sharing.
Working with Midjourney is an associative process, an exchange of words and images between a human and a machine. It is organic, chaotic, and often non-intuitive. A stream of consciousness that is hard to explain to others.
But I will try.
Start with an idea
First, there is an idea. The idea can be a word, a sentence, or a few paragraphs. It can be a feeling, a memory, or just an impulse to create something, anything.
Here's an idea: a purple ladybug
Second, I write my idea into the Midjourney input field. When my fingers meet the keyboard, the idea changes. For me, this transition from thought to prompt is fascinating. It is not unique to Gen AI; it happens when I write anything with any tool. My thoughts change as I write them down.
The difference when using Midjourney, ChatGPT, or any other AI tool with prompt-based input is that the change is directly linked to my knowledge of how the generative models work. I try to fit my idea into a mold that I, probably incorrectly, believe is the best way to interact with the AI.
I often try to challenge myself to prompt in a way that is as far from "best practice" as possible, but this time I failed. I just wrote a basic boring prompt.
Cute. But what will happen if I use my first hortensia image as a style reference with the ladybug prompt?
As the hortensia/ladybug renders, another idea suddenly comes to me: a supersonic blast in a clear sky.
Again, the words change a bit as I put them into Midjourney.
That is not how I imagined a supersonic blast, but ok. Now my hortensia/ladybug is done as well!
I like this one, even though it doesn't look like any 3D animation I have seen. But maybe I can mix it with the supersonic image somehow? That could be interesting. But how? Well, on a whim, I use the picture above as a character reference and the supersonic one as a style reference.
After three variations, it turns out like this:
Shiny! And boring. Let's rerun the prompt, add some --weird, and see what happens.
The ladybug looks like it is made out of painted wood! What would a chair in the same style look like?
Let's find out.
I added a camera model just for fun. Fujica ST605 is a budget household camera from the 1970s. I keep a list of cameras from different eras to have something to work with. Sometimes, if you want your image to have a vibe of a specific period, it is easier to specify a camera model ubiquitous during that era than to use phrases like "in the style of the 1970s" or whatever. Sometimes, not always. Random image generator, remember?
I'm unsure how much the Fujica affected the result, but I like the chair image. Very cool floor and lovely lighting. The chair in itself, though, is perhaps the most unsafe-for-kids piece of furniture I have ever seen.
But – I have no use for an image of a chair. I am trying to generate some sort of cartoon ladybug!
Fast forward 4 weeks
Suddenly, I need a picture of a chair. I remember the ladybug chair and look it up in my Midjourney archive. It is close to what I'm looking for but not spot on. So I do some experiments with the chair picture. I use it as a style reference, an image prompt, and a style reference again. I do a lot of variations. Remixes. I try some different prompts. I won't show you all of them here, but after a while, I get this:
Pretty nice. The hardest part was generating ok-looking legs.
Ladybug goes to space
So, what happened to the ladybug? I returned to it after a while, inspired by the loading screen from the 1983 Atari video game M.U.L.E., to make this picture:
Why did I mix the ladybug with the loading screen from a 41-year-old Atari video game? I'm still trying to figure that out, but the idea came to me after a friend texted me about the game out of the blue.
I downloaded the image and opened it in Photoshop to remove those weird things in the sky and adjust the colors somewhat.
Next, I uploaded the photoshopped image to Midjourney again, used it as a reference, and prompted away.
Finally – a spacefaring ladybug robot!
Frankenstein's prompting
As you can see, my process is a Frankenstein's monster. Ideas generating pictures generating ideas generating pictures… Everything is based on something else. Which, I guess, is the essence of generative AI?
Questions like "What prompt did you use to make this picture?" are mostly meaningless because, most of the time, an AI image is not the result of a single, easy-to-show prompt. Instead, it is the result of a long, meandering, associative brainstorming session between human and AI.
A prompt can actually be misleading. Take this image of a robot, for example. The final prompt was "Manga drawing of a mecha in combat":
While it is correct that the prompt that generated the image read that way, to truly understand the process, one must rewind almost two months.
When you look at someone's AI-generated artwork, it is essential to remember how much chance plays into the result. Using Midjourney as an artistic or creative tool is akin to action painting – where artists randomly throw paint on a canvas. In action painting, the artist chooses the paint, the canvas, and the location. The artist controls the setting, so to speak, but the end result is inherently random.
"Prompt engineering" is throwing paint over and over again until you get what you want. Frankly, it is not engineering at all, and we should stop calling it that.
It's not engineering, it's not math, it's not coding, it's not mechanics.
Let's call it what it is: creativity.