Less is More in Image Prompting

Brian Sykes

A.I. Catalyst | AI Educator for Creative Professionals | LinkedIn Top 100 AI Artists | Brand Architect | Keynote Speaker

Published: August 19, 2024

An exploration of overly wordy prompts, and of refining them to be more precise so you get what you want.

I saw the following post on LinkedIn after two of my connections shared it. It got me thinking, and I felt that, for clarity’s sake, it needed to be explored and addressed.

I want to start with some initial statements:

1. I understand that Ismail is trying out long prompts in Flux, which is another genAI text-to-image platform, and not Midjourney.

2. Amid the proliferation of words and fantastic-looking images, most people will not dig deeper, and so will not recognize that this is not actually working.

3. This deeper dive is not an attack on the post, the poster, or the re-sharers; rather, I am intent on lending some clarity.

I made this comment on the post — and I share it here because this is the crux of the issue.

What’s the point?

The key in prompting is to find the line between describing what you want, avoiding terms that have no impact, and staying within the token limit so that the words you use are not lost in translation. We need to be more succinct and pithy in our prompts for better results.

My friend Adrian Elton noticed the same thing I did: this really long prompt is not being adhered to, even for requests made in its very first line.

I went to ChatGPT-4o:

Using my 12 Elements of Prompting framework, modify the following description into a condensed and pithy prompt.

I followed this with the full copy-paste of the long-form prompt by Ismail, to which ChatGPT replied:

I took this condensed prompt format to Midjourney, adding an aspect ratio and style raw.
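As a rough sketch of that step, the helper below appends Midjourney's `--ar` and `--style raw` parameters to a prompt string. The function name, the placeholder prompt, and the 2:3 aspect ratio are assumptions made for illustration; the article does not state which ratio was used.

```python
def with_midjourney_params(prompt, aspect_ratio="2:3", style_raw=True):
    """Append Midjourney parameters to the end of a prompt.

    The 2:3 default is an assumed portrait ratio, not a value from the article.
    """
    parts = [prompt.strip(), f"--ar {aspect_ratio}"]
    if style_raw:
        parts.append("--style raw")
    return " ".join(parts)


if __name__ == "__main__":
    # Placeholder text; substitute your own condensed prompt.
    print(with_midjourney_params("your condensed prompt here"))
```

Parameters go at the end of the prompt, after the descriptive text, so the description itself stays short and readable.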

I modified the prompting using ChatGPT-4o

The key is that it is NOT a broad expanse of words that gives you the most success in prompting and renders, but rather a refined approach based on succinctly stating what you want.

Charles Dickens is popularly said to have been paid by the word, so extra words were useful for his output. In prompting, though, we want to be as succinct as possible for the best results: truly, less IS more.

In the ChatGPT conversation, I asked ChatGPT to create an image using the prompt. It often modifies its own prompt before rendering with Dall-E. Here is what IT produced:

Create an ultra-realistic, abstract, symmetric long shot of a super futuristic residential building facade merging ornate Baroque and bold Brutalist architecture. Highlight intricate Baroque curves and elaborate ornamentation seamlessly integrated with powerful Brutalist geometry. Use atmospheric, diffused lighting to enhance reflective and refractive qualities, making the facade appear almost alive. Emulate a 35mm film photograph with a subtle grain and dreamlike warmth, introducing a timeless, cinematic quality. Challenge the viewer’s sense of scale and perspective in a hypnotic visual experience, showcasing architectural innovation for a futuristic world. Portrait format.

When given a lot of words, AI decides which ones it will pay attention to. The reason is tokenization, interpretation, and application. These are some core aspects of how AI processes text inputs, especially when generating images or performing other complex tasks.

AI models like Flux (the one Ismail interacted with), Midjourney, and even Dall-E here break down the input text into smaller units called tokens, which can be words or parts of words. These tokens help the model understand and generate responses. The model has a maximum limit of tokens it can consider at one time, which can affect how it handles lengthy inputs.

Here’s a simplified breakdown:

Tokenization: The model splits the text into manageable pieces (tokens). If the text exceeds the model’s token limit, it may not consider some parts of the input.
Interpretation: The AI interprets the tokens based on the training it has received on vast amounts of text data. It tries to understand the context and the specific task it needs to perform.
Application: The model applies what it has interpreted to generate a response or create something new, like an image. In doing so, it prioritizes certain tokens over others based on their perceived relevance to the task.
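The tokenization step and its limit can be sketched in a few lines of Python. This is a toy illustration, not how any of these platforms is actually implemented: the word-and-punctuation splitter stands in for a real subword tokenizer, and the limit of 77 is borrowed from the CLIP text encoder's context length purely as an example. The limits used by Flux, Midjourney, and DALL-E may differ.

```python
import re

# Illustrative limit only: 77 is the CLIP text encoder's context length.
# Other models may use different limits.
MAX_TOKENS = 77


def toy_tokenize(text):
    """Split text into word and punctuation tokens (stand-in for a subword tokenizer)."""
    return re.findall(r"\w+|[^\w\s]", text)


def split_at_limit(text, max_tokens=MAX_TOKENS):
    """Return (kept, dropped) token lists for a prompt under a hard token limit."""
    tokens = toy_tokenize(text)
    return tokens[:max_tokens], tokens[max_tokens:]


if __name__ == "__main__":
    # A deliberately repetitive, overlong prompt.
    long_prompt = " ".join(["ornate baroque facade,"] * 40)
    kept, dropped = split_at_limit(long_prompt)
    print(f"kept {len(kept)} tokens, dropped {len(dropped)}")
```

Under a hard limit like this, everything past the cutoff never reaches the model at all, which is one reason to put the most important terms first and keep the prompt short.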

When you provide a lot of detailed information, as in a lengthy prompt for an image, the AI aims to capture the essence of the request but might focus on certain aspects more than others. This isn't just a limitation but also a method of ensuring that the most crucial elements (as understood by the AI) are addressed.

So while there's a practical limit to what the AI can process at once, it's also about how the AI interprets and prioritizes information within the context of its capabilities and the specific task at hand.

Chet Moss

Lover of all things creative

1 month ago

I'm finding less is more valuable everywhere these days, Brian. Less media chatter, less self-indulgence, less stress, etc. In a one-word prompt: / less.

Michael Uman

1 month ago

I try to strike a balance between longer and shorter prompts. I always start short. I often go to POE and have it make the prompt longer, then I edit that prompt for unnecessary content. There is usually a difference in detail and results with the longer prompt, sometimes for the better and sometimes not. My longer prompts don't get nearly as long as Ismail's prompt.

James Larkin

AI | Midjourney | Runwayml

1 month ago

Yeah, I agree; it's rare that a long prompt gives consistently different results.

Bill Snebold

Creative Director, Motion Designer, AI Enthusiast/Influencer/Educator

1 month ago

Brian, I had the same reaction you did upon seeing that post. I also thought that the results weren’t very interesting given the vast number of words used. Hey, as an experiment I took your original list of parameters and pasted that into Midjourney (I edited in the --no parameter, though). Here are the results.

Aritra Mukherjee

Helping Busy Founders & Coaches Get More Clients | LinkedIn Ghostwriter | 20+ Happy Clients | Ex-Organic Buzz

1 month ago

Absolutely loved your insight on the impact of word count in prompts, Brian Sykes! One key aspect often missed is the importance of context within those words. Even with a high word count, if the context isn’t precise, the results can still be off-target. Always aim for clarity in context, not just quantity.
