Development Plan #166
Replies: 19 comments 18 replies
-
Thank you for your team's efforts. Will Fooocus continue to be updated? It seems like you have quite a few projects going at the moment.
-
By the way, there are some flawed tests on the internet that run the original webui and Forge at the same time and claim their speeds are similar. This is wrong: if you open the original webui and Forge simultaneously, the original will simply grab all available GPU memory, and Forge will detect that and automatically switch to a slower method that uses less VRAM, so that it can still generate images without OOM in both applications. (If Forge used its normal, faster method with more VRAM, the original webui would OOM.) To see the speedup, you need to at least close the original webui so that Forge can use your GPU normally.
-
Superb work!
-
Whilst the work done is amazing, it should be made really clear in the project's readme that the test devices for this project do not include AMD cards, and that DirectML support is currently non-existent. This repo has been suggested in multiple places as, at the very least, an option for AMD owners, but it is clear such a recommendation shouldn't have been made.
-
So either my vanilla AUTO1111 config is optimal, or my Forge one is misconfigured; I wonder which one it is :) I can get 25% higher resolution before OOM, so that's good, but I wish the speedup were more apparent.
-
This is so, SO good for people with 8GB of VRAM (a 3070 Ti in my case). On 1111 with medvram, even YouTube playback lags when generating. Forge allowed me to UP the resolution I'm generating at AND watch YouTube/local video normally. Thank you!
-
Currently testing ControlNet IP-Adapter FaceID with my ancient GPU, a GTX 970 4GB, and the results are amazing. With OG A1111 it takes 3–4 minutes to generate one image; with Forge it takes only 20–30 seconds.
-
@lllyasviel Any consideration for a branch, perhaps the default, pulling from main rather than dev? I have had few, if any, problems with dev, but it certainly seems unintended as a base for projects and could potentially lead to breakage. That said, it'd be nice if dev moved over to main a little more routinely. I can't say why it's been 2+ months since the last release when dev has great features and feels stable, except that making mainline releases certainly takes time and energy.
-
Is there a Discord or something for discussing Forge and troubleshooting? |
-
I have tested with these "--directml --skip-torch-cuda-test --always-normal-vram --skip-version-check" commandline args, and it's the same speed as A1111, if not slower. But I have successfully created 1920x1080 images (with or without upscale) in around 5 minutes. I couldn't use a higher resolution in A1111 due to out-of-memory issues. I have an AMD RX 6700 XT with 12GB. Also, I can't preview the image while it's processing, and I don't know whether that is intended or not. Thanks.
-
No info about the RTX 4080?
-
really amazing work |
-
Does Forge support Cascade? |
-
This has been huge for me! I'm shouting praises about forge whenever possible. Thank you for making my dinky lil' 2070 Super a whole lot more useful! Thank you also for all your hard work on everything else you do. Much appreciated. |
-
What exactly would Forge be in A1111-extension form?
Well, some extensions are already incompatible between Forge and A1111; isn't that enough to make the two clients compete with each other?
-
What about Stable Cascade and Würstchen, any plans for those? People around the internet are asking about it: since there are already some models for them, why isn't there a webui or a port for them? For example, merging Forge or something like it with Cascade/Würstchen and the upcoming SD3 is just something people ask about at times. There's also ELLA coming (https://ella-diffusion.github.io/); from what people have tested, it's 26 times better than CLIP and OpenCLIP combined, and it's already out for ComfyUI.
https://github.com/TencentQQGYLab/ELLA
https://github.com/Stability-AI/StableCascade
https://stability.ai/news/introducing-stable-cascade
https://huggingface.co/blog/wuerstchen
-
On my 3070 8G, the best thing about Forge is that it never runs out of VRAM. I really support this!
-
I don't think Forge will support those new models any time soon. ComfyUI and SDNext support most of the models; StableSwarm is still in beta but supports Stable Cascade, Playground 2.5, and SD3. Automatic1111 does not support any of those new models either.
-
Forge is a platform on top of Stable-Diffusion-WebUI that makes generation faster and development easier. We will get all updates from the dev branch of the original webui automatically with bots, and we do not have any motivation or plan to compete with the original webui.
Another reason is that we have several ongoing research projects planned, and we want to use this very friendly webui that everyone loves, but we really do not want users to be disappointed by the speed and performance, especially since several of our future works will be based on SDXL.
We promise that when the following (1) and (2) happen together, this repo will immediately change to an extension of the standard Stable-Diffusion-WebUI from Automatic1111; that is to say, it will immediately rejoin the original sd-webui ecosystem of Automatic1111 when the following two things happen together:
(1) On at least 4 of our 5 test devices (RTX 2060, RTX 3060 laptop, RTX 3090, RTX 4090, RTX 3070 Ti laptop), the original webui is equally fast or at most 10% slower (except the RTX 3090 and 4090, since the speedup for them is only around 5%). All CMD flags are accepted, but we exclude tech that sacrifices functionality, like TensorRT or torch compile. Note that this measures a full generation pass: CLIP time, diffusion time, model-moving time, and VAE time. We use SDXL at 1024x1024, with batch size 1 and 4, and with batch count 1 and 16.
(2) On at least 4 of our 5 test devices (RTX 2060, RTX 3060 laptop, RTX 3090, RTX 4090, RTX 3070 Ti laptop), the original webui is equally memory efficient or uses at most 512MB more VRAM, and its maximum diffusion image resolution without OOM is at least 90% of Forge's max resolution (on the RTX 3060 laptop and RTX 3070 Ti laptop). All CMD flags are accepted.
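For concreteness, the numeric thresholds in those two criteria can be sketched as a small check. This is purely illustrative (the helper names and the per-device result dicts are ours, not part of the repo), and it covers only the "10% slower", "512MB more VRAM", "90% resolution", and "4 of 5 devices" thresholds, not the full benchmark procedure:

```python
def speed_ok(orig_s: float, forge_s: float) -> bool:
    # (1): original webui is equally fast or at most 10% slower than Forge
    return orig_s <= forge_s * 1.10

def memory_ok(orig_mb: float, forge_mb: float) -> bool:
    # (2): original webui uses at most 512 MB more VRAM than Forge
    return orig_mb <= forge_mb + 512

def resolution_ok(orig_px: int, forge_px: int) -> bool:
    # (2), laptop devices only: max resolution without OOM is >= 90% of Forge's
    return orig_px >= 0.90 * forge_px

def criteria_met(results: list) -> bool:
    # results: one dict per test device, with measured seconds and MB for
    # the original webui ("orig_*") and Forge ("forge_*")
    passing = sum(
        1 for r in results
        if speed_ok(r["orig_s"], r["forge_s"])
        and memory_ok(r["orig_mb"], r["forge_mb"])
    )
    # the promise triggers only when at least 4 of the 5 devices pass
    return passing >= 4
```

Plugging in measurements from the five devices listed above would then answer whether the switch-back condition has been met.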
If the above two things happen together, we will immediately move back to the original sd-webui ecosystem, and this repo will be made into an extension. Do not worry about engineering complexity; we have an excellent team that can solve every problem.
We will test every 15 days against the original webui dev branch, or anyone can ask us to test at any time by replying to this post.
Until then, all major updates from us will happen here while we wait for upstream to resolve the memory/speed issues. The tool set of Forge after 0.0.10 is relatively complete now; users should not have big problems with image-processing tasks.
Note that the Forge backend API will not be modified even if we change to an extension.
lllyasviel