MXFP8?
#2
by eepos - opened
Thanks for the informative generation speed table!
Are you planning on doing an MXFP8 quant? I'm curious to see how it would compare in speed (and quality).
I didn't upload mxfp8 because:
- It only runs fast on Blackwell GPUs.
- The fp8 model seems okay to me.
Here's a quick test of mxfp8 on a 5090:
```
Model ErnieImage prepared for dynamic VRAM loading. 7834MB Staged. 0 patches attached.
100%|██████████████████████████████████████| 20/20 [00:10<00:00, 1.93it/s] # fp8
Requested to load ErnieImage
Model ErnieImage prepared for dynamic VRAM loading. 8068MB Staged. 0 patches attached.
100%|██████████████████████████████████████| 20/20 [00:10<00:00, 1.84it/s] # mxfp8
Requested to load ErnieImage
Model ErnieImage prepared for dynamic VRAM loading. 15322MB Staged. 0 patches attached.
100%|██████████████████████████████████████| 20/20 [00:19<00:00, 1.05it/s] # bf16
```
mxfp8 is slightly bigger and slightly slower than fp8.
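The size gap above is roughly what block scaling predicts. As a back-of-the-envelope sketch (assuming MXFP8 stores one shared 8-bit scale per 32-element block, per the OCP MX format, on top of the 8-bit elements), the overhead over plain fp8 is about 8/(32·8) ≈ 3%:

```python
# Rough arithmetic sketch, not a measurement. Assumption: MXFP8 adds
# one 8-bit shared scale per 32-element block (OCP MX format), while
# plain fp8 (float8_e4m3fn) carries negligible per-tensor scale overhead.
BLOCK = 32                    # MX block size (elements per shared scale)
fp8_mb = 7834                 # staged size from the fp8 log above
overhead = 8 / (BLOCK * 8)    # extra scale bits per 8-bit weight = ~3.1%
mxfp8_mb_est = fp8_mb * (1 + overhead)
print(f"estimated mxfp8 size: {mxfp8_mb_est:.0f}MB")
# → ~8079MB, close to the 8068MB staged in the mxfp8 log
```

So the extra ~230MB is essentially the per-block scales, not a different element width.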
ERNIE-Image
ERNIE-Image-Turbo
Both mxfp8 and fp8 seem okay to me.
By the way, you can quantize to mxfp8 using comfy-dit-quantizer (it takes only ~1 min):
- Clone comfy-dit-quantizer.
- Copy `configs/ernie-image-fp8.json` to `configs/ernie-image-mxfp8.json` and change `float8_e4m3fn` to `mxfp8` in the file.
- Activate ComfyUI's venv.
- Run `python quantize.py configs/ernie-image-mxfp8.json <BF16 MODEL> <MXFP8 MODEL>`
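The config-editing step is just a string swap, so it can be scripted. A minimal sketch (the JSON body below is a dummy stand-in; in the real repo you'd read the actual `configs/ernie-image-fp8.json`, whose schema I'm not assuming anything about beyond the dtype string mentioned above):

```python
# Sketch of the "copy and edit the config" step: derive the mxfp8
# config from the fp8 one by swapping the dtype string, as described
# in the post. The config content here is a dummy placeholder.
import json
import tempfile
from pathlib import Path

workdir = Path(tempfile.mkdtemp())

# Stand-in for configs/ernie-image-fp8.json (real schema may differ).
fp8_cfg = workdir / "ernie-image-fp8.json"
fp8_cfg.write_text(json.dumps({"dtype": "float8_e4m3fn"}))

# Copy to the mxfp8 name and replace the dtype string.
mx_cfg = workdir / "ernie-image-mxfp8.json"
mx_cfg.write_text(fp8_cfg.read_text().replace("float8_e4m3fn", "mxfp8"))

print(json.loads(mx_cfg.read_text()))  # {'dtype': 'mxfp8'}
```

Then pass the new config path to `quantize.py` as in the command above.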
Thanks for the info. Looks like there's not much to gain with MXFP8 over FP8.
I tried to quantize earlier, but I have ComfyUI portable, not a venv, and ran into some issues. I'll try to troubleshoot with an LLM later.
eepos changed discussion status to closed

