Great model quality, but desperately needs universal quantization/optimization (took 35 mins)

#3 by Spawn - opened

The image quality from this model is truly impressive, but the generation time is practically unusable right now. It took me about 35 minutes to generate a single image.

I really appreciate the effort to port this model and make it compatible with ComfyUI. However, we badly need universally compatible quantization or compute optimizations, rather than implementations that only benefit specific, latest-generation NVIDIA GPUs.

I am currently running an AMD Radeon 7900 XTX with 64GB of system RAM, and the current FP8 implementation causes massive memory swapping and severe bottlenecking. It would be amazing to see a more universally supported quantization format (like standard INT8 or GGUF) so that users with diverse hardware setups can actually utilize this fantastic model.
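To illustrate what a hardware-agnostic INT8 path looks like (as opposed to FP8, which needs recent NVIDIA hardware for native support), here is a minimal sketch using PyTorch's dynamic quantization. The tiny `nn.Sequential` model is a hypothetical stand-in, not the actual model discussed in this thread:

```python
import torch
import torch.nn as nn

# Hypothetical stand-in for a model block; the real model's
# architecture is not shown in this thread.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 64))

# Dynamic INT8 quantization: weights are stored as int8 and
# activations are quantized on the fly. This runs on plain CPU
# kernels, with no vendor-specific GPU requirements.
qmodel = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 64)
out = qmodel(x)
print(tuple(out.shape))
```

GGUF takes a different route (pre-quantized weight files with block-wise schemes), but the appeal is the same: the format is defined independently of any one GPU generation.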
(Attached images: BitDance_00001_, BitDance_00002_, bitgpu)
