Less is More in Image Prompting

An exploration of overly wordy prompts, and of refining them to be more precise so you get what you want.

I saw the following post on LinkedIn, which was also shared by two of my connections (that is how I came across it). It got me thinking, and I felt that, for clarity's sake, it needed to be explored and addressed.

I want to start with some initial statements:

1. I understand that Ismail is trying out long prompts in Flux, another genAI text-to-image platform, and not in Midjourney.

2. Amid the proliferation of words and fantastic-looking images, most people will not dig deeper, and so will not recognize that this is not actually working.

3. This deeper dive is not an attack on the post, the poster, or the re-sharers; rather, I am intent on lending some clarity.


I made this comment on the post — and I share it here because this is the crux of the issue.

What’s the point?

The key in prompting is to find the balance: describe what you want, avoid terms that have no impact, and stay safely within the limits of tokenization so that the words you rely on are not lost in translation. We need to be more succinct and pithy in our prompts for better results.

My friend Adrian Elton noticed the same thing I did: this really long prompt is not being adhered to, starting with the requests made in its very first line.

I went to ChatGPT-4o with this request:

Using my 12 Elements of Prompting framework, modify the following description into a condensed and pithy prompt.

I followed this with a full copy-paste of Ismail's long-form prompt, to which ChatGPT replied:

my ChatGPT conversation

I took this condensed prompt format to Midjourney, adding an aspect ratio and style raw.

--ar 5:3
--ar


I modified the prompting using ChatGPT-4o

The key is that it is NOT a broad expanse of words that gives you the most success in prompting and rendering, but a refined approach based on succinctly stating what you want.

Charles Dickens is popularly said to have been paid by the word, so extra words served his output. In prompting, though, the opposite holds: the more succinct and pithy the prompt, the better the results. Truly, less IS more.


In the ChatGPT conversation, I asked ChatGPT to create an image using the prompt. It often modifies its own prompt before rendering with Dall-E. Here is what IT produced:

Create an ultra-realistic, abstract, symmetric long shot of a super futuristic residential building facade merging ornate Baroque and bold Brutalist architecture. Highlight intricate Baroque curves and elaborate ornamentation seamlessly integrated with powerful Brutalist geometry. Use atmospheric, diffused lighting to enhance reflective and refractive qualities, making the facade appear almost alive. Emulate a 35mm film photograph with a subtle grain and dreamlike warmth, introducing a timeless, cinematic quality. Challenge the viewer’s sense of scale and perspective in a hypnotic visual experience, showcasing architectural innovation for a futuristic world. Portrait format.
Dall-E render

When given a lot of words, AI decides which ones it will pay attention to. The reason lies in tokenization, interpretation, and application. These are some core aspects of how AI processes text inputs, especially when generating images or performing other complex tasks.

AI models like Flux (the one Ismail interacted with), Midjourney, and even Dall-E here break down the input text into smaller units called tokens, which can be words or parts of words. These tokens help the model understand and generate responses. The model has a maximum limit of tokens it can consider at one time, which can affect how it handles lengthy inputs.

Here’s a simplified breakdown:

  1. Tokenization: The model splits the text into manageable pieces (tokens). If the text exceeds the model’s token limit, it may not consider some parts of the input.
  2. Interpretation: The AI interprets the tokens based on the training it has received on vast amounts of text data. It tries to understand the context and the specific task it needs to perform.
  3. Application: The model applies what it has interpreted to generate a response or create something new, like an image. In doing so, it prioritizes certain tokens over others based on their perceived relevance to the task.
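The tokenization step above can be sketched in a few lines of Python. This is only a toy illustration: the word-and-punctuation splitter and the `MAX_TOKENS` value of 20 are assumptions made for the example, not the tokenizer or limit of Flux, Midjourney, or Dall-E, which use subword tokenizers with their own limits.

```python
import re

# Hypothetical token limit, chosen only for illustration.
# Real models use different (and usually larger) limits.
MAX_TOKENS = 20


def toy_tokenize(prompt: str) -> list[str]:
    """Split a prompt into lowercase words and punctuation marks.

    Real tokenizers split text into subword units; this is a stand-in.
    """
    return re.findall(r"[a-z0-9]+|[^\sa-z0-9]", prompt.lower())


def split_at_limit(prompt: str, limit: int = MAX_TOKENS) -> tuple[list[str], list[str]]:
    """Return (tokens within the limit, tokens past the limit)."""
    tokens = toy_tokenize(prompt)
    return tokens[:limit], tokens[limit:]


long_prompt = (
    "Create an ultra-realistic, abstract, symmetric long shot of a super "
    "futuristic residential building facade merging ornate Baroque and "
    "bold Brutalist architecture."
)

kept, dropped = split_at_limit(long_prompt)
print("kept:   ", " ".join(kept))
print("dropped:", " ".join(dropped))
```

With this toy limit, the two architectural styles at the end of the sentence fall past the cutoff, which is the kind of loss a front-loaded, condensed prompt is meant to avoid.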

When you provide a lot of detailed information, as in a lengthy prompt for an image, the AI aims to capture the essence of the request but might focus on certain aspects more than others. This isn't just a limitation but also a method of ensuring that the most crucial elements (as understood by the AI) are addressed.

So while there's a practical limit to what the AI can process at once, it's also about how the AI interprets and prioritizes information within the context of its capabilities and the specific task at hand.
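One way to picture this prioritization is a toy weighting in which every token competes for a fixed budget of attention. The sketch below uses a plain softmax over made-up relevance scores; the scores, the function names, and the idea of a single weighting pass are all assumptions for illustration, not a description of how any of these image models work internally.

```python
import math


def softmax(scores: list[float]) -> list[float]:
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def share_of_key_token(n_tokens: int, key_score: float = 2.0,
                       filler_score: float = 1.0) -> float:
    """Weight given to one important token among n_tokens - 1 filler tokens."""
    scores = [key_score] + [filler_score] * (n_tokens - 1)
    return softmax(scores)[0]


short_share = share_of_key_token(10)    # a pithy prompt
long_share = share_of_key_token(200)    # a very wordy prompt

print(f"key token weight in a 10-token prompt:  {short_share:.3f}")
print(f"key token weight in a 200-token prompt: {long_share:.3f}")
```

Because the weights always sum to 1, every extra filler word takes a little away from the word you actually care about, even when that word scores higher than the rest.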

Chet Moss

Lover of all things creative

1 month ago

I'm finding less is more valuable everywhere these days, Brian. Less media chatter, less self-indulgence, less stress etc. In a one-word prompt: / less.

Michael Uman

Creative Director | Story Telling | Branding | Art Direction | Graphic Design | Prompt Architect | Midjourney and AI explorer

1 month ago

I try to strike a balance between longer and shorter prompts. I always start short. I often go to POE and have it make the prompt longer. I edit that prompt for unnecessary content. There is usually a difference in detail and results with the longer prompt, sometimes for the better and sometimes not. My longer prompts don't get nearly as long as Ismail's prompt.

James Larkin

AI | Midjourney | Runwayml

1 month ago

Yeah, I agree, it's rare that a long prompt gives consistently different results.

Bill Snebold

Creative Director, Motion Designer, AI Enthusiast/Influencer/Educator

1 month ago

Brian, I had the same reaction you did upon seeing that post. I also thought that the results weren’t very interesting given the vast amount of words used. Hey, as an experiment I took your original list of parameters and pasted that into Midjourney (I edited in the --no parameter though). Here are the results.

Aritra Mukherjee

Helping Busy Founders & Coaches Get More Clients | LinkedIn Ghostwriter | 20+ Happy Clients | Ex-Organic Buzz

1 month ago

Absolutely loved your insight on the impact of word count in prompts, Brian Sykes! One key aspect often missed is the importance of context within those words. Even with a high word count, if the context isn’t precise, the results can still be off-target. Always aim for clarity in context, not just quantity.
