
Fun with Stable Diffusion, Forge and Flux on AI generating images

Reiner Merz

Certified AI and OpenText Content Server developer

Published: February 20, 2025


What is Stable Diffusion?

From Wikipedia:

Stable Diffusion is a deep learning, text-to-image model released in 2022 based on diffusion techniques. The generative artificial intelligence technology is the premier product of Stability AI and is considered to be a part of the ongoing artificial intelligence boom.

It is primarily used to generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as inpainting, outpainting, and generating image-to-image translations guided by a text prompt.[3] Its development involved researchers from the CompVis Group at Ludwig Maximilian University of Munich and Runway with a computational donation from Stability and training data from non-profit organizations.[4][5][6][7]

Stable Diffusion is a latent diffusion model, a kind of deep generative artificial neural network. Its code and model weights have been released publicly,[8] and it can run on most consumer hardware equipped with a modest GPU with at least 4 GB VRAM. This marked a departure from previous proprietary text-to-image models such as DALL-E and Midjourney which were accessible only via cloud services.[9][10]

(All links and footnotes refer to the original article)

What is Forge?

From "Installation and use of Forge, a simple and efficient drawing tool that is better than WebUI":

The stable-diffusion-webui-forge tool is a Stable Diffusion WebUI (based on Gradio) AI drawing platform that aims to simplify plugin development, optimize resource management, and accelerate inference. The name "Forge" was inspired by "Minecraft Forge". The goal of this project is to become the Forge of SD WebUI. Forge promises never to add unnecessary changes to the Stable Diffusion WebUI user interface. Those who are familiar with Stable Diffusion WebUI can use their experience with Automatic1111 WebUI to get started with Forge quickly.

Off topic: The Forge author has been active in the AIGC drawing community. He has successively open-sourced the excellent ControlNet and Fooocus projects, and recently he has invested in the development of Forge, aiming to lower the entry threshold of AIGC drawing for novices.

At an image resolution of 1024px, Forge can achieve a significant speed-up over the original WebUI in SDXL model inference.

(All links and footnotes refer to the original article)

What is Flux?

From Wikipedia:

Flux (also known as FLUX.1) is a text-to-image model developed by Black Forest Labs, based in Freiburg im Breisgau, Germany. Black Forest Labs were founded by former employees of Stability AI. As with other text-to-image models, Flux generates images from natural language descriptions, called prompts.

Flux is a series of text-to-image models. The models are based on a hybrid architecture that combines multimodal and parallel diffusion transformer blocks scaled to 12 billion parameters.

According to a test performed by Ars Technica, the outputs generated by Flux.1 Dev and Flux.1 Pro are comparable with DALL-E 3 in terms of prompt fidelity, with the photorealism closely matched Midjourney 6 and generated human hands with more consistency over previous models such as Stable Diffusion XL.[32]

Flux has been criticised for its very realistic generated images. According to media reports, depictions ranged from an image of Donald Trump posing with guns to disturbing scenes, which triggered discussions about ethical implications of technologies developed by Black Forest Labs.[4][13]

After the release of the model, social media X was flooded with Flux-generated images.[33][34] Black Forest Labs has not provided exact details of the data used to train the model.[29] Ars Technica suspected that Flux is based on a large, unauthorised collection of images scraped from the internet, a controversial practice with potential legal consequences.[32][35]

(All links and footnotes refer to the original article)

Where to find and download these components?

stable-diffusion-webui-forge: https://github.com/lllyasviel/stable-diffusion-webui-forge
Installation package download: https://github.com/lllyasviel/stable-diffusion-webui-forge/releases/download/latest/webui_forge_cu121_torch21.7z

Beware: The 7z file is around 2 GB; the complete installation takes around 96 GB.

Then you need the Flux model:

Hugging Face https://huggingface.co/black-forest-labs/FLUX.1-dev

Civitai https://civitai.com/models/618692/flux (you need to log in there with your email address)

GitHub https://github.com/black-forest-labs/flux

Beware: The Flux dev model takes about 20 GB, and any LoRAs you add take further space, depending on your needs. (The SDXL model at https://civitai.com/models/101055?modelVersionId=128078 needs only 200 MB.)

Remark: All VAEs and text encoders should download automatically. If they do not, the installation will complain, and you should try to download the missing files from GitHub or Hugging Face.
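
Given those sizes, it can be worth checking free disk space before downloading. The following is a minimal Python sketch, not part of Forge: the function names are made up for illustration, and it assumes the 96 GB installation and the 20 GB model simply add up, which may overestimate what you need.

```python
import shutil

# Sizes quoted in this article, in GB.
FORGE_INSTALL_GB = 96   # complete Forge installation
FLUX_DEV_GB = 20        # Flux dev model


def free_space_gb(path: str) -> float:
    """Return the free space on the drive that holds `path`, in GB (10^9 bytes)."""
    return shutil.disk_usage(path).free / 1e9


def has_enough_space(path: str, required_gb: float) -> bool:
    """True if the drive that holds `path` has at least `required_gb` GB free."""
    return free_space_gb(path) >= required_gb


if __name__ == "__main__":
    needed = FORGE_INSTALL_GB + FLUX_DEV_GB
    ok = has_enough_space(".", needed)
    print(f"Need about {needed} GB, free: {free_space_gb('.'):.0f} GB, enough: {ok}")
```

Run it from the drive where you plan to unpack the 7z file; LoRAs and other models come on top of this estimate.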

What are the requirements?

I am using an HP Z440 workstation with 64 GB memory, a Xeon E5-1650 v3 @ 3.5 GHz, and Windows (version 23H2).

I use an NVIDIA RTX A4000 with 16 GB VRAM.

In my experience, the A4000 determines the speed; the CPU is hardly involved.

With 6 GB VRAM, generating an image takes around 5-10 min; with 16 GB VRAM, it takes 25 s to 1:30 min.
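
Since VRAM makes such a difference, here is a small Python sketch for looking up how much your own NVIDIA card has. It assumes the `nvidia-smi` tool that ships with the NVIDIA driver is on your PATH; the function names are my own, not part of any library.

```python
import subprocess


def parse_gpu_lines(output: str) -> list[tuple[str, int]]:
    """Parse 'name, memory' CSV lines from nvidia-smi into (name, MiB) tuples."""
    gpus = []
    for line in output.strip().splitlines():
        # Split on the last comma only, in case a GPU name contains a comma.
        name, mem = line.rsplit(",", 1)
        gpus.append((name.strip(), int(mem.strip())))
    return gpus


def query_gpus() -> list[tuple[str, int]]:
    """Ask nvidia-smi for the name and total memory (MiB) of each GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_gpu_lines(result.stdout)


if __name__ == "__main__":
    for name, mib in query_gpus():
        print(f"{name}: {mib / 1024:.1f} GiB VRAM")
```

On a machine without an NVIDIA driver, `query_gpus()` raises an error because `nvidia-smi` cannot be found.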

A glimpse of Image Generation

Here I can only show you a short glimpse. For a deeper dive, refer to my future articles.

Start it by running [installdir]\webui_forge_cu121_torch21\webui\webui-user.bat (or webui-user.sh if you are on Linux).

Forge then opens in a browser window.

Here, Flux is preselected.

Let's try a prompt:

Enter the prompt "tall skinny supermodel looking softly at the viewer, fashionable clothing with a hint of innocence and grace" and press Generate.

The calculation is logged in the cmd window:

Distilled CFG Scale: 3.5
[Unload] Trying to free 30800.42 MB for cuda:0 with 0 models keep loaded ... Current free memory is 5530.43 MB ... Unload model JointTextEncoder Done.
[Memory Management] Target: KModel, Free GPU: 15179.04 MB, Model Require: 22700.13 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: -8545.10 MB, CPU Swap Loaded (blocked method): 9882.00 MB, GPU Loaded: 12818.13 MB
Moving model(s) has taken 45.94 seconds
100%|██████████████████████████████████████████████████████████████████████████████████| 20/20 [01:28<00:00,  4.44s/it]
[Unload] Trying to free 4495.77 MB for cuda:0 with 0 models keep loaded ... Current free memory is 2082.06 MB ... Unload model KModel Done.
[Memory Management] Target: IntegratedAutoencoderKL, Free GPU: 15154.58 MB, Model Require: 159.87 MB, Previously Loaded: 0.00 MB, Inference Require: 1024.00 MB, Remaining: 13970.71 MB, All loaded to GPU.
Moving model(s) has taken 8.54 seconds
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [01:07<00:00,  3.37s/it]
Total progress: 100%|██████████████████████████████████████████████████████████████████| 20/20 [01:07<00:00,  2.97s/it]
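
The [Memory Management] lines explain why the first model load is slow on a 16 GB card. Below is a minimal Python sketch that parses such a line and redoes the arithmetic. The relationships it checks (Remaining = Free GPU - Model Require - Inference Require, and GPU Loaded + CPU Swap Loaded = Model Require) are inferred from the numbers in this log, not taken from the Forge documentation, and the function names are my own.

```python
import re

# First [Memory Management] line from the log above.
LOG_LINE = (
    "[Memory Management] Target: KModel, Free GPU: 15179.04 MB, "
    "Model Require: 22700.13 MB, Previously Loaded: 0.00 MB, "
    "Inference Require: 1024.00 MB, Remaining: -8545.10 MB, "
    "CPU Swap Loaded (blocked method): 9882.00 MB, GPU Loaded: 12818.13 MB"
)


def parse_memory_line(line: str) -> dict[str, float]:
    """Extract every 'Label: 123.45 MB' pair from a log line."""
    pairs = re.findall(r"([A-Za-z][A-Za-z ()]*?): (-?\d+\.\d+) MB", line)
    return {label.strip(): float(value) for label, value in pairs}


def computed_remaining(fields: dict[str, float]) -> float:
    """Free VRAM minus what the model and the inference step ask for."""
    return fields["Free GPU"] - fields["Model Require"] - fields["Inference Require"]


if __name__ == "__main__":
    fields = parse_memory_line(LOG_LINE)
    print(f"Computed remaining: {computed_remaining(fields):.2f} MB")
    print(f"Logged remaining:   {fields['Remaining']:.2f} MB")
    swapped = fields["CPU Swap Loaded (blocked method)"]
    print(f"Kept in system RAM: {swapped:.2f} MB of {fields['Model Require']:.2f} MB")
```

A negative Remaining value means the model does not fit into VRAM, so in this run about 9.9 GB of it stays in system memory. The second [Memory Management] line shows the opposite case: the autoencoder needs only 159.87 MB and is loaded to the GPU completely.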

The generated image then appears in the browser.
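
The same prompt could also be sent from a script instead of the browser. This is only a sketch under several assumptions that I have not verified for every Forge version: that Forge was started with the `--api` option known from Automatic1111 WebUI, that it listens on the default local address `http://127.0.0.1:7860`, and that it offers the `/sdapi/v1/txt2img` endpoint, which returns base64-encoded images. The function names and the output file name are made up for illustration.

```python
import base64
import json
import urllib.request

# Assumption: Forge runs locally with the API enabled; adjust to your setup.
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"

PROMPT = (
    "tall skinny supermodel looking softly at the viewer, "
    "fashionable clothing with a hint of innocence and grace"
)


def build_payload(prompt: str, steps: int = 20) -> dict:
    """Build a minimal request body; 20 steps matches the log in this article."""
    return {"prompt": prompt, "steps": steps}


def generate(prompt: str, out_path: str = "output.png") -> str:
    """Send the prompt to the (assumed) txt2img endpoint and save the first image."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        API_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    # Assumption: the response holds base64-encoded images under "images".
    with open(out_path, "wb") as image_file:
        image_file.write(base64.b64decode(result["images"][0]))
    return out_path


if __name__ == "__main__":
    print("Saved", generate(PROMPT))
```

All other settings, such as the image size and the selected model, are left at whatever is configured in Forge.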
