FIRE YOUR HR MARKETER BECAUSE OF THIS FREAKY TEXT-TO-IMAGE INNOVATION [HOW-TO]

Every now and then, technological evolution brings new applications of theoretical science.

In this article, I am going to focus on how to generate unique images from natural-language prompts (i.e. text-to-image).

Doesn't ring a bell?

Check out these pictures. What do they have in common?

Nobody painted them. Nobody animated them. Nobody photographed them. AI did.

They were created by AI software from natural-language prompts such as Donald Trump, rococo.

Or emilia clarke as lara croft, dirty face, wet hair, covered in filth and mud, rain, wearing military boots.

Or Bill Jobs.

Or the entire universe contained inside a glass jar.

Or skoda car on the wet road, mirror light on puddles, night, futuristic, realistic, high detailed, unreal engine.

These are usually followed by some additional parameters that adjust the size or quality of the output. I will explain those in a bit, as well as how you can generate these AI images on your own.

It’s trendy enough to have made it onto Last Week Tonight with John Oliver.

So How Can You Do This On Your Own?

There is more than one solution on the market right now. A few of the dominant players are:

  • Midjourney (my favourite),
  • DALL-E 2 by OpenAI,
  • Google Imagen,
  • Make-A-Scene by Meta,
  • and Stable Diffusion.

Each has its own features, limitations and internal AI logic, so you will not get the same images even if you use the same prompts.


1) Midjourney

To get an idea of what Midjourney can do, go to their Showcase page. Once you log in, you will see even more images, including the full text prompts (i.e. input text queries) which led to their generation.

When you want to generate your own images, you just need the Discord app (a native app on your mobile phone or computer, or the web-based app on your desktop or tablet).

As a recruiter or talent sourcer, you should be present on Discord anyway. It’s today’s platform for running and joining all kinds of communities, something like Slack for the public with some extra features. One of them is support for scripting (bots), which is why Midjourney decided to use this public tool as its frontend instead of building its own app.

Once you log into Discord, click on this link to get into the Midjourney community.

You can generate roughly 25 images for free (see the pricing below), so you can start right away. Go to one of the newcomers’ rooms, named e.g. newbies-130, newbies-160, etc. The great thing is that generating images in the free version is public: you can see what other people generate and which text-to-image prompts they use, so you can take them as inspiration and build on them.

They can also see your prompts and the images you generate for free. Once you have a paid account, you can generate images privately over Discord direct messages.

Now, try to generate your first image by typing:

/imagine prompt:<your own text prompt>
        

Once you submit your first prompt, you get 4 variations by default within a few minutes.

As you can see, you get further options right under the image. You can either upscale one or more of the images (buttons U1, U2, U3, U4) or create variations of a specific image (V1, V2, V3, V4).

The numbers follow the image grid from left to right and top to bottom: 1 is top left, 2 is top right, 3 is bottom left and 4 is bottom right.

Let’s click on U4 for example. This action will upscale the selected image.

We can still click on Make Variations to generate other variations of this image. This is equivalent to clicking one of the V1, V2, V3, V4 buttons after the first images were generated.

My first prompt to test it out was:

/imagine prompt: Super modern wooden cottage in the snowy mountains with huge glass windows facing coniferous forest, sunset, vivid, detailed, realistic, 4k        

Then I created variations of the image number 3 by pressing the button V3.

And upscaled to the final image.

We can really say that it has everything I asked for in my prompt:

  • super modern wooden cottage .. yes
  • in the snowy mountains .. yes
  • with huge glass windows .. yes
  • facing coniferous forest .. yes?
  • sunset .. yes
  • vivid, detailed, realistic, 4k .. yes.

The tricky part is that, for example, the sunset is rendered as a reflection in the glass of the cottage, or almost as if the sunset were inside the cottage. You probably wouldn’t do that if I gave you the same instructions. And that’s the beauty of AI-generated pictures: the creativity in how the AI understands your prompts.

I believe that even if you had the same goal as me, you would get quite different results. Every word matters, and the AI may interpret a text prompt in a totally different direction. That is beautiful and scary at the same time, because you can get results you would never have thought of.

For instance, do you think that a “hot dog” prompt will make an image of food or an animal?


It’s All About the Prompts

An artist realizes their ideas through a paintbrush. The AI artist does the same through a text prompt, fine-tuned over continuous iterations, and there is real skill in how well you can do that.

We can structure the full text-prompt into 3 parts:

a) text defining the object (what)

+

b) text defining the graphics and style of the object (how)

+

c) extra parameter switches (how)
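
The three parts above can be sketched in a few lines of Python. This is only an illustration of the structure, not an official Midjourney tool: the function name `build_prompt` and its arguments are my own assumptions, and the result is still pasted into Discord by hand.

```python
def build_prompt(subject: str, styles: list[str], switches: str = "") -> str:
    """Join the three parts: what (subject), how (styles), and switches.

    Hypothetical helper for illustration; Midjourney itself only sees
    the final string.
    """
    # Part a) and part b) are joined with commas, as in the example prompts.
    parts = [subject.strip()] + [s.strip() for s in styles if s.strip()]
    text = ", ".join(parts)
    # Part c) is appended after a space, at the very end of the prompt.
    return f"{text} {switches.strip()}".strip()


prompt = build_prompt(
    "hot dog",
    ["octane render", "black and white", "vibrant"],
    "--ar 4:3",
)
print("/imagine prompt: " + prompt)
```

Keeping the three parts separate makes it easy to iterate on the style keywords while leaving the object and the switches untouched.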


a) Text Defining the Object

A good start is to open the Tips for Text-Prompts page in the Midjourney User Manual. It suggests some basic rules on how to work with the text.

  • Specify clearly what you want:
      • Avoid: “monkeys doing business”
      • Try: “three monkeys in business suits”

  • Speak in positives. Avoid negatives:
      • Avoid: “a hat that’s not red”
      • Try: “a blue hat”

  • Try to use singular nouns or specific numbers:
      • Avoid: “cyberpunk wizards”
      • Try: “three cyberpunk wizards”

It’s a pretty sophisticated AI, but once you know a bit about its internals, it is still trained on pictures from the internet (DALL-E 2 was trained on over 650 million images from the internet). This means it works well for objects that are well defined visually.

b) Text Defining the Graphics and Style

There are many text prompts which you can use to adjust the appearance of the final image.

Let’s see this slightly ridiculous example:

hot dog octane render, black and white, vibrant, 12th century, ambient occlusion, dynamic lighting, oil        

The part after "hot dog" is the text defining the graphics and style.

You can use text prompts like:

  • Genre: cyber punk style, Pixar movie style, grunge style, etc.
  • Real painters: Andy Warhol style, Salvador Dali, Picasso, etc.
  • Lights and render: glowing lights, blue lighting, octane render, cinematic lighting, unreal engine, 8k, etc.

And the list goes on. If you want to know more, check out this article.

c) Extra Parameter Switches

Text prompts support not only plain text but also additional parameters, or switches if you like.

If for instance, we want to generate the picture of a hot dog from the previous example in a 4:3 ratio, we can do the following:

hot dog octane render, black and white, vibrant, 12th century, ambient occlusion, dynamic lighting, oil --ar 4:3        

Or do the same in the size of 500x500 pixels:

hot dog octane render, black and white, vibrant, 12th century, ambient occlusion, dynamic lighting, oil --w 500 --h 500        

You can specify weights of keywords in your prompt. If we mean by our “hot dog” prompt some kind of food, we can decrease the weight of the animal:

hot dog:: animal::-1 food        

There is also the parameter --no, which is the same as the weight -0.5, so you can for example do this:

hot dog --no ketchup        
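
To make the weight syntax more tangible, here is a rough Python sketch of how such a prompt breaks down into weighted parts. It is my own simplified reading of the `::` notation, not Midjourney's parser: the function names are hypothetical, and `expand_no` assumes that `--no` is the last switch in the prompt.

```python
import re


def parse_weights(prompt: str) -> list[tuple[str, float]]:
    """Split a prompt on '::' into (text, weight) pairs.

    A number directly after '::' is taken as the weight of the part in
    front of it; a part without a number gets the default weight 1.
    """
    pairs = []
    pos = 0
    # Each match is one '::' separator plus an optional number behind it.
    for m in re.finditer(r"::(-?(?:\d+(?:\.\d+)?|\.\d+))?", prompt):
        text = prompt[pos:m.start()].strip()
        weight = float(m.group(1)) if m.group(1) else 1.0
        if text:
            pairs.append((text, weight))
        pos = m.end()
    tail = prompt[pos:].strip()
    if tail:
        pairs.append((tail, 1.0))
    return pairs


def expand_no(prompt: str) -> list[tuple[str, float]]:
    """Treat 'text --no thing' as 'thing' carrying the weight -0.5."""
    text, sep, excluded = prompt.partition("--no")
    pairs = parse_weights(text)
    if sep and excluded.strip():
        pairs.append((excluded.strip(), -0.5))
    return pairs


print(parse_weights("hot dog:: animal::-1 food"))
print(expand_no("hot dog --no ketchup"))
```

Read this way, the first example has three parts: "hot dog" and "food" with the default weight and "animal" pushed down to -1.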

There are also some special Midjourney parameters to fine-tune your results. With the parameter --chaos you can increase or decrease the level of abstraction by using a value between 0 and 100:

hot dog --chaos 80        

Another similar parameter is --s, which styles the output more or less strongly. Or --creative, which lets the AI be more creative.

If you want to know more about the possible parameters, type /help in the Midjourney Discord channel or go to the User Manual describing these extra parameters.
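
If you generate many prompts, it can help to assemble the switches in code and catch obvious mistakes before pasting them into Discord. The sketch below is a hypothetical helper of mine, not part of Midjourney. It covers only the switches mentioned in this article (--ar, --chaos, --no) and checks only the 0 to 100 range stated above for --chaos.

```python
from typing import Optional


def build_switches(
    aspect_ratio: Optional[tuple[int, int]] = None,
    chaos: Optional[int] = None,
    no: Optional[str] = None,
) -> str:
    """Return a switch string such as '--ar 4:3 --chaos 80'."""
    parts = []
    if aspect_ratio is not None:
        width, height = aspect_ratio
        parts.append(f"--ar {width}:{height}")
    if chaos is not None:
        # The article gives 0 to 100 as the valid range for --chaos.
        if not 0 <= chaos <= 100:
            raise ValueError("chaos must be between 0 and 100")
        parts.append(f"--chaos {chaos}")
    if no:
        parts.append(f"--no {no}")
    return " ".join(parts)


print("hot dog " + build_switches(aspect_ratio=(4, 3), chaos=80))
```

Any other switch would need its own check; see /help or the User Manual for what your Midjourney version actually accepts.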


Text-Prompts Wizards

As you can see, getting the output you expect (or an unexpected yet still reasonable one) is not necessarily simple. That’s why there are wizards that help you craft your text prompts.

One of these tools is Phraser. It simply asks you questions and makes suggestions about what you want to create: what graphics, feelings, etc. Based on that, you get a complete text prompt. It supports not only Midjourney but also the other AI text-to-image providers from this article.

Another similar tool is Midjourney Prompt Helper, which does a similar thing through a predefined form.

It also works for DALL-E which I will tell you more about later in this article.

Midjourney Pricing

You get 25 free GPU minutes, which means about 25 regular images (upscaled images take more GPU time and variations take less). After that, plans range from $10 a month for 200 GPU minutes up to $600 per year for 120 GPU hours.
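
To compare the plans, you can convert the figures above into a price per GPU minute. The short calculation below uses only the numbers quoted in this section. Treating one GPU minute as roughly one regular image is the same approximation as above, and anything else the plans may include is not modelled.

```python
# Figures quoted in the pricing paragraph above.
MONTHLY_PRICE_USD, MONTHLY_GPU_MINUTES = 10, 200
YEARLY_PRICE_USD, YEARLY_GPU_HOURS = 600, 120


def usd_per_gpu_minute(price_usd: float, gpu_minutes: float) -> float:
    """Price of one GPU minute, i.e. roughly one regular image."""
    return price_usd / gpu_minutes


monthly = usd_per_gpu_minute(MONTHLY_PRICE_USD, MONTHLY_GPU_MINUTES)
yearly = usd_per_gpu_minute(YEARLY_PRICE_USD, YEARLY_GPU_HOURS * 60)

print(f"$10 monthly plan: ${monthly:.3f} per GPU minute")
print(f"$600 yearly plan: ${yearly:.3f} per GPU minute")
```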


2) DALL-E 2

DALL-E 2 was recently (Oct 2022) released to the general public, so you can try it as well. It works directly from the website or through the API.
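
For the API route, a request could look roughly like the sketch below. Treat it as an assumption-laden illustration: the endpoint and the field names (`model`, `prompt`, `n`, `size`) follow OpenAI's image generation API as I understand it, `YOUR_API_KEY` is a placeholder, and the helper name `build_image_request` is my own. Check the current API reference before relying on it. The request is only built here, not sent.

```python
import json
import urllib.request

# Image generation endpoint of the OpenAI API (verify against current docs).
API_URL = "https://api.openai.com/v1/images/generations"


def build_image_request(
    prompt: str, api_key: str, n: int = 4, size: str = "1024x1024"
) -> urllib.request.Request:
    """Prepare (but do not send) a text-to-image request."""
    payload = {"model": "dall-e-2", "prompt": prompt, "n": n, "size": size}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_image_request(
    "dramatic cat with VR headset and with a medal of honor on the neck, digital art",
    api_key="YOUR_API_KEY",
)

# Sending needs a real key and network access, so it is left commented out:
# with urllib.request.urlopen(req) as resp:
#     for item in json.load(resp)["data"]:
#         print(item["url"])
```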

If you have already tried Midjourney, DALL-E 2 will be pretty straightforward for you, even though it has some special features.

Let’s try our first text-to-image prompt.

We got 4 options.

What we can do is select one of them and edit the picture.

I’m going to use a brush to erase a part of the image on the cat’s neck.

And I’m going to change the prompt to:

dramatic cat with VR headset and with a medal of honor on the neck, digital art

Let’s see what we get.

It seems like a painted cat with a VR headset and a medal of honor on the neck. No painting or animation skills required.

Btw: Microsoft made a huge investment in OpenAI, the company behind DALL-E, and is coming up with a new AI-based design tool, Microsoft Designer, which uses text-to-image technology. You may have noticed that Midjourney, Stable Diffusion and DALL-E 2 have some problems in common, such as human hands (which often come out deformed) and text in the image.

That’s why generating, for example, a poster or anything else with text might be a bit challenging. Microsoft Designer is changing this by putting text-to-image technology into a graphics editor, which could actually replace your graphic designers pretty soon.

DALL-E 2 Pricing

You get 50 free credits for your first month and a refill of 15 free credits every month after that. For $15 you can buy 115 credits, which is approx. 460 images.
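
The credit math is easy to check. Using only the figures above (115 credits for roughly 460 images, i.e. four images per credit), a quick calculation gives the number of images in the free tiers and the price per paid image. The constant and function names are mine.

```python
# Figures from the pricing paragraph above.
PAID_CREDITS, PAID_IMAGES, PAID_PRICE_USD = 115, 460, 15

# One credit corresponds to one prompt, which returns several images.
IMAGES_PER_CREDIT = PAID_IMAGES / PAID_CREDITS


def images_for(credits: int) -> int:
    """Approximate number of images you get for a number of credits."""
    return int(credits * IMAGES_PER_CREDIT)


cost_per_image = PAID_PRICE_USD / PAID_IMAGES

print(f"First month (50 free credits): about {images_for(50)} images")
print(f"Later months (15 free credits): about {images_for(15)} images")
print(f"Paid credits: about ${cost_per_image:.3f} per image")
```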

3) Stable Diffusion

Stable Diffusion is an open-source model, so you can run the whole algorithm on your own computer instead of in the cloud. If you have a Mac with an M1 chip, you can simply install the DiffusionBee app. If you want to install it on Windows, go to this how-to. You can also go directly to the project’s GitHub page.

Let’s try the same prompt we finished with last time:

dramatic cat with VR headset and with a medal of honor on the neck, digital art        

You will notice that the quality is different. It also missed the VR headset completely.

Let’s work on the picture some more through the Style options. With these you don’t need to remember all the options or consult the help pages all the time, as you do for example with Midjourney.

Under the Options button you can adjust the number of results or the size of the image.

The results are surprisingly better now.

Find more Stable Diffusion prompts (plus the options) for your inspiration here.

As this is using your computer power only, you can generate as many images as you want completely for free.

Stable Diffusion Pricing

You will like this. As you are using your own GPU to calculate the images, it’s free.

Image-to-Image

All of the described technologies (Midjourney, DALL-E 2 and Stable Diffusion) support using an existing image or images to create your desired outcome.

Let me give you a few examples from all of the platforms.

DALL-E 2

I uploaded a headshot photo of myself.

And I added a text prompt to it:

a man presenting in front of a large audience, laser lights, realistic        

I have seen more variations of this and I suspect that those are really my hands and my regular shirt :) Anyway, I would use this picture straight away.

Stable Diffusion (DiffusionBee)

Let’s just try to change the style of the image to Van Gogh style.

Midjourney

Midjourney also supports inserting more than one image. The prompt might look as follows:

/imagine https://www.example.com/image1.jpg https://www.example.com/image2.jpg a box full of chocolates        

This simply takes the images and the text-prompt as the input. If you want your output to be more similar to the image rather than to the text-prompt, increase the weight of the image:

/imagine https://www.example.com/image.jpg a box full of chocolates --iw 4
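
If you combine images and text often, a tiny helper keeps the order straight: image URLs first, then the text, then the switches. The sketch below is a hypothetical convenience function of mine, not a Midjourney API. It writes the image weight in the space-separated form (`--iw 4`) used by the other switches in this article, and the example.com URLs are the same placeholders as above.

```python
from typing import Optional


def image_prompt(
    image_urls: list[str], text: str, image_weight: Optional[float] = None
) -> str:
    """Build the prompt text: image URLs, then the text, then --iw."""
    for url in image_urls:
        # Image prompts must be links to images that are reachable online.
        if not url.startswith(("http://", "https://")):
            raise ValueError(f"not a web URL: {url}")
    parts = list(image_urls) + [text.strip()]
    if image_weight is not None:
        parts.append(f"--iw {image_weight}")
    return " ".join(parts)


print("/imagine " + image_prompt(
    ["https://www.example.com/image.jpg"],
    "a box full of chocolates",
    image_weight=4,
))
```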

The Dark Side of AI Images

Firstly, this brings back the question of how biased AI is. Some people complain that when you generate CEOs, you get mostly white males, and when you generate a nurse, you get mostly a woman of color. The AI only reproduces patterns in the images it was trained on, so it mirrors the current, even if unwanted, state of our society.

Secondly, maybe you are asking yourself: “Can I really generate anything? Anything from my darkest fantasies?”

Yes and no.

Cloud-based services such as Midjourney and DALL-E 2 prohibit the use of some words, and there are quite a lot of them.

Midjourney prohibits your prompts from using these keywords:

  • Gore: blood, kill, car crash, sadist, suicide, etc.
  • Female body parts: booty, big ass, sexy female, hooters, etc.
  • Adult content: bimbo, brothel, F word, hardcore, hentai, making love, etc.
  • Clothing: bra, naked, nude, lingerie, no shirt, etc.
  • Taboo: nazi, slave, prophet mohammed, etc.
  • Drugs: heroin, meth, crack, etc.
  • People: all pornstar names, hitler, president xi
  • Other: torture, sperm, surgery, poop, etc.

Find more prohibited words in Midjourney here.
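
Before you spend a prompt, you can run a quick local check against such a list. The sketch below is only an illustration: the set holds a handful of the keywords listed above, not Midjourney's full list, and the whole-word matching is my assumption, since I do not know how Midjourney's own filter matches words.

```python
import re

# A small illustrative subset of the keywords listed above.
BANNED = {"blood", "kill", "car crash", "heroin", "meth", "torture"}


def find_banned(prompt: str, banned: set[str] = BANNED) -> list[str]:
    """Return the banned terms that appear as whole words in the prompt."""
    lowered = prompt.lower()
    hits = []
    for term in sorted(banned):
        # \b keeps 'kill' from matching inside a longer word like 'killer'.
        if re.search(r"\b" + re.escape(term) + r"\b", lowered):
            hits.append(term)
    return hits


print(find_banned("Car crash on a wet road, night, realistic"))
print(find_banned("skoda car on the wet road, mirror light on puddles"))
```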

You can avoid the Midjourney filter by using other variations of the prohibited word like vaccination instead of vaccine.

The situation is similar with DALL-E 2, even though there is no official list of prohibited words to my knowledge.

Importantly, the biggest difference from Midjourney is that you CANNOT use real names in DALL-E. This means no celebrities or anything like that.

This is not a problem for Midjourney.

See this robust text-prompt:

Paul rudd with messy hair, Male druid elf, smile, adventurer clothing, smiling Art by artgerm and Greg Rutkowski, Bastien Lecouffe-Deharme, Helnwein, Beeple, Giger, Moebius, Tom Bagshaw and Alphonse mucha and Jesper Ejsing, stunning close up portrait photography of with lots of neon on the background, bioluminescence, ornate, epic composition, textured, insanely detailed, micro details, elegant, strong, ornate, textured, photorealistic, realistic, digital painting, hyperrealistic, Award Winning photography, sharp focus, concept art, league of legends, cinematic soft lighting, smooth, arcade, super sharp features, sharp focus, illustration, unreal engine 5, 8k        

None of this applies to Stable Diffusion, though, where you can generate anything. You bet I’ve tried! :) This is where true anarchy still exists.

Naked Brad Pitt? No problem at all :)

The image output was partially blurred on purpose for obvious reasons.

Other Text-to-Image Generators

To be exhaustive on this topic, I have to say that there are many other applications doing AI text-to-image processing. Some of them use text-prompts and some of them use wizard interface only (we can still consider them as text-to-image).

DALL-E Mini / Craiyon

GLID-3

PornPen.ai

If you need to generate images that the mainstream text-to-image generators such as DALL-E 2 or Midjourney refuse to produce, you can either use Stable Diffusion (as described before) or PornPen, a specialized AI image generator for adult content.

Part of the image output was blurred intentionally for obvious reasons.

If you are asking: Is there a male version of this? I don’t know but you can still use Stable Diffusion with the extra benefit of using real people (see above).

Art Generator

This tool runs on the virtual Linux machine associated with your Google Colab account. First, run all of the scripts (using the “Play” button at each section) from top to bottom until you get to the section “Parameters for The Art Generator”, where you can add your text prompt and some other parameters.

Generated Photos

You can either browse through already generated faces and filter them by parameters such as age, ethnicity, eye color, hair color, emotion, etc.

Or you can generate completely new AI faces in their Face Generator.

It is similar to This Person Does Not Exist but more sophisticated. You can also generate whole humans including their bodies.


What’s next?

AI text-to-video! It’s already here in its preliminary stage (Oct 2022) and evolving fast. This music video is for example done by AI.

Runway is a company with promising results.

Meta is trying hard with text-to-video too.


Concept for Recruitment

Let’s let our imagination loose a bit and brainstorm some ideas on how you can make this work for you in recruitment.

A few of my ideas include:

a) Original images for your job ads.

Whenever you create a new job advertisement or job description which you need to support with an image, you or your HR marketing department would probably download something from a stock photo library or craft something in Canva on your own. The former is neither original nor unique, and the latter might be time-consuming, plus it’s limited by your skills and imagination.

Maybe a mechanical engineer in the automotive industry?

b) You need some illustrative and realistic photos of people for your career site, job description or maybe a fake LinkedIn profile which you want to use as a honeypot to map the competition.

c) You might also be looking for a new logo for your HR event and you don’t want to rely on a single graphic designer or on crowdsourcing platforms such as 99designs.com.

d) You might want to edit your current images by erasing or adding things into the image or changing their style completely. I can imagine using this for the LinkedIn photos of your employees too to put them into a unified original style.

Using celebrities might be an appealing (and still legal) angle for presenting various job roles. How about Brad Pitt as a nurse? :)

It needs some more polishing but you get the idea.

e) Any other branding images, possibly bordering on art, to support the feeling of the role, company, HR event, etc.

crazy software developer working from home, cyberpunk, fotorealistic, realistic        

What else?

Good luck with generating your own images. You can easily kill time for a whole week with it.

And feel free to copy & paste your best results into the comments under this article.

My LinkedIn newsletter builds on the legacy of one of the first talent sourcing books on the market, People as Merchandise, which was the original seed for establishing the 150-employee sourcing powerhouse GoodCall with its two wing companies, Datacruit ATS and Recruitment Academy, awarded as the 415th fastest growing company in Europe by the Financial Times.

If you relate to this next-gen content about next-generation talent sourcing techniques, follow my LinkedIn profile for more. 9.95 out of 10 claim that it’s poisonously delicious.

Oss!


More articles by Josef José Kadlec