How to Generate AI Images with Stable Diffusion XL in 5 Minutes

Want to generate awesome AI images on your own machine?

Stable Diffusion XL is one of the best local image generators out there, and here's how you can set it up in minutes.

Note: you will need a decent graphics card for this. A minimum of 4 GB of VRAM is required; you'll be much better off with 8 GB or more.

We will set up Stable Diffusion XL on a Linux system. The instructions are largely the same on macOS, and on Windows if you use WSL.

Step 1: Create a Python Virtual Environment

Let's set up a Python virtual environment. This helps us manage dependencies and keeps our project clean.

python -m venv stablediff        

Activate the virtual environment with this command:

source stablediff/bin/activate        

You should see the name of your environment in parentheses before your prompt, which confirms it's active.
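
For example (the username and host here are just placeholders):

(stablediff) user@host:~$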

Now, we'll install our dependencies!

Step 2: Install Dependencies

First, install everything except diffusers:

pip install invisible_watermark transformers accelerate safetensors xformers        

Install diffusers last, since it may downgrade some of the packages above to versions that work together:

pip install diffusers        
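
At this point, you can optionally sanity-check that PyTorch (which the packages above pull in) can see your GPU:

python -c "import torch; print(torch.cuda.is_available())"

If this prints True, you're good to go.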

Now you're ready to create your Python file and generate some images!

Step 3: Create a Simple Generator

Create a file named app.py (or whatever you want).

In that file, let's import our libraries:

from diffusers import DiffusionPipeline
import torch        

Next, we want to initialize the pipeline to generate images. We're going to use the stable-diffusion-xl-base-1.0 pretrained model.

We'll set our datatype to float16 for memory efficiency and enable the use of safetensors:

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
)

Next, we'll send the pipeline to the GPU:

pipe.to("cuda")        
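
If you're close to the 4 GB minimum, one option is to let diffusers offload model components to the CPU, trading speed for memory. Note that if you go this route, you call it instead of pipe.to("cuda"):

# offloads submodules to the CPU, moving each to the GPU only when needed
pipe.enable_model_cpu_offload()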

Next, define the text prompt to send to the model. This can be anything you want.

prompt = "A anthropomorphic poodle riding a dirt bike through the forest"        

One thing you may want to add is enabling xformers attention for memory efficiency (note: on PyTorch 2.x, diffusers already uses memory-efficient attention by default, so this step is optional):

pipe.enable_xformers_memory_efficient_attention()        

Now, we can generate our image!

image = pipe(prompt=prompt).images[0]        

Once the image is generated, we can save it:

image.save("output.png")        
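
If you want reproducible output, you can also pass a seeded generator to the pipeline (a minimal sketch; the seed 42 is arbitrary):

# same seed + same prompt = same image
generator = torch.Generator("cuda").manual_seed(42)
image = pipe(prompt=prompt, generator=generator).images[0]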

Step 4: Run it!

Now it's time to run the file and get a cool image!
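
Assuming you named the file app.py as above, run it from the activated environment:

python app.py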

The first time you run it, the model weights will be downloaded; after that, you'll see a progress bar in the terminal as the denoising steps run.

When it finishes, you'll have your image saved as output.png!

Pretty awesome, right?

Using Base + Refiner

You can get even better quality images using an "ensemble of experts" pattern with a base and a refiner:

from diffusers import DiffusionPipeline
import torch

# load both base & refiner
base = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
    use_safetensors=True,
)

base.to("cuda")

refiner = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,
    vae=base.vae,
    torch_dtype=torch.float16,
    use_safetensors=True,
    variant="fp16",
)
refiner.to("cuda")

# Define the total steps and what fraction of them run on each expert (80/20 split)

n_steps = 40
high_noise_frac = 0.8

prompt = "A anthropomorphic poodle riding a dirt bike through the forest"

# run both experts

image = base(
    prompt=prompt,
    num_inference_steps=n_steps,
    denoising_end=high_noise_frac,
    output_type="latent",
).images

image = refiner(
    prompt=prompt,
    num_inference_steps=n_steps,
    denoising_start=high_noise_frac,
    image=image,
).images[0]        
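
As before, save the final image (the filename here is just an example):

image.save("output_refined.png")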

This produces noticeably better results, and n_steps and high_noise_frac are easy knobs to tweak.

Awesome!


Conclusion

This is the easiest, lowest-overhead way I know to run Stable Diffusion XL. If you prefer a graphical interface, the Stable Diffusion Web UI gives you easy access to lots of controls and lets you swap models and refiners easily.

Follow me for more cool stuff like this!
