Just a quick comparison of the speed and image quality of the two new fp8 options added today.
Testing was done on an RTX2060 6GB with counterfeitXLv2 as my test model. The same workflow was used for each test.
The workflow I used is very basic and not ideal in most cases. I wanted raw unassisted outputs.
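For reference, the fp8 options are passed as launch flags. A minimal sketch of how I started ComfyUI for each run (assuming a standard install where the entry point is `main.py`; adjust the path for your setup):

```shell
# e4m3fn run
python main.py --fp8_e4m3fn-unet

# e5m2 run
python main.py --fp8_e5m2-unet

# fp16 baseline: no fp8 flag, default precision
python main.py
```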
--fp8_e4m3fn-unet:
seed: 420
27.98s, 25 steps, dpm++2m, 1.03 it/s
--fp8_e5m2-unet:
seed: 420
25.68s, 25 steps, dpm++2m, 1.09 it/s
fp16:
seed: 420
25 steps, speed irrelevant for reasons explained in my notes.
Other notes:
Model loading time was a bit longer than with fp16, as expected. I did multiple generations before recording the final speed to ensure that everything was cached. Because of my limited vram, some ram offloading is required for fp16, so it is much slower than both fp8 options. That test was only done to compare image quality/similarity.
I've yet to find something that doesn't work with the new fp8 options. I've gotten great results when using LCM and turbo.
Conclusion:
They're both fine. Neither is objectively better; each maintains some aspects of the fp16 version that the other lacks. From other seeds I've tried (not included here to avoid filling this post with images), I think I prefer --fp8_e5m2-unet a little bit more. If you have limited vram, fp8 is a no-brainer.
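For context on why the two formats trade off differently: e4m3 spends its 8 bits as 4 exponent + 3 mantissa (more precision, smaller range), while e5m2 uses 5 exponent + 2 mantissa (more range, less precision). A quick sketch computing the largest finite value of each from the standard bit layouts (these are the usual e4m3fn/e5m2 definitions, not anything specific to ComfyUI):

```python
# e5m2: IEEE-754 style, top exponent value reserved for inf/nan.
# Max finite = (2 - 2^-2) * 2^(30 - 15)
e5m2_max = (2 - 2 ** -2) * 2 ** ((2 ** 5 - 2) - 15)

# e4m3fn: "finite" variant, no inf; only mantissa-all-ones at the top
# exponent encodes NaN. Max finite = (2 - 2*2^-3) * 2^(15 - 7)
e4m3fn_max = (2 - 2 * 2 ** -3) * 2 ** ((2 ** 4 - 1) - 7)

print(e5m2_max)    # 57344.0
print(e4m3fn_max)  # 448.0
```

So e5m2 can hold much larger weight/activation magnitudes before clipping, while e4m3fn resolves small differences more finely, which fits the observation that each preserves different aspects of the fp16 output.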