Quantized models of ERNIE-Image / ERNIE-Image-Turbo
- nvfp4 (4.78GB)
- fp8e4m3 (8.22GB)
- int8rowwise (8.22GB), requires the ComfyUI-INT8-Fast custom node
Generation Speed
ERNIE-Image-Turbo
| GPU | Quantization | Speed (it/s) | Time (secs) | vs BF16 |
|---|---|---|---|---|
| RTX 5090 | bf16 | 2.09 | 4.87 | 100% |
| | fp8e4m3 | 3.69 | 3.32 | 147% |
| | int8rowwise | 4.31 | 3.05 | 160% |
| | nvfp4 | 5.09 | 2.72 | 179% |
| RTX 3090 | bf16 | 0.88 | 12.42 | 100% |
| | fp8e4m3 | 0.84 | 12.73 | 98% |
| | int8rowwise | 1.66 | 7.04 | 176% |
| | nvfp4 | 0.83 | 12.71 | 98% |
| RTX 3060 | bf16 | 0.26 | 43.02 | 100% |
| | fp8e4m3 | 0.39 | 28.66 | 150% |
| | int8rowwise | 0.82 | 14.43 | 298% |
| | nvfp4 | 0.39 | 28.72 | 150% |
ERNIE-Image
| GPU | Quantization | Speed (it/s) | Time (secs) | vs BF16 |
|---|---|---|---|---|
| RTX 5090 | bf16 | 1.08 | 20.08 | 100% |
| | fp8e4m3 | 1.97 | 11.67 | 172% |
| | int8rowwise | 2.14 | 10.89 | 184% |
| | nvfp4 | 2.56 | 9.35 | 215% |
| RTX 3090 | bf16 | 0.40 | 53.33 | 100% |
| | fp8e4m3 | 0.39 | 54.71 | 97% |
| | int8rowwise | 0.79 | 28.08 | 190% |
| | nvfp4 | 0.38 | 55.20 | 97% |
| RTX 3060 | bf16 | 0.11 | 201.41 | 100% |
| | fp8e4m3 | 0.17 | 130.48 | 154% |
| | int8rowwise | 0.35 | 62.42 | 323% |
| | nvfp4 | 0.17 | 130.87 | 154% |
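
The "vs BF16" values in both tables are consistent with dividing the bf16 time by the quantized time and rounding to a whole percent. A minimal sketch that recomputes a few of them from the times listed above (the helper name `relative_speed` is made up for illustration and is not part of any tool mentioned here):

```python
def relative_speed(bf16_secs: float, quant_secs: float) -> int:
    """Return speed relative to bf16 as a whole percent (bf16 = 100)."""
    # A shorter time means a faster run, so the baseline goes in the numerator.
    return round(bf16_secs / quant_secs * 100)


# (bf16 time, quantized time) pairs copied from the tables above.
rows = {
    "Turbo / RTX 5090 / nvfp4": (4.87, 2.72),
    "Turbo / RTX 3060 / int8rowwise": (43.02, 14.43),
    "Image / RTX 3090 / fp8e4m3": (53.33, 54.71),
}

for label, (bf16_secs, quant_secs) in rows.items():
    print(f"{label}: {relative_speed(bf16_secs, quant_secs)}%")
```

A value below 100%, as for fp8e4m3 and nvfp4 on the RTX 3090, means the quantized run took slightly longer than bf16 on that GPU.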
Sample
ERNIE-Image-Turbo
ERNIE-Image
How to reproduce
Use https://github.com/bedovyy/comfy-dit-quantizer with the config JSON below.
{
  "block_names": ["layers"],
  "rules": [
    { "policy": "keep", "match": ["adaLN", "self_attention.norm"] },
    { "policy": "float8_e4m3fn", "match": ["mlp", "self_attention.to"] }
  ]
}
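
To get a feel for which tensors a config like this would affect, here is a rough Python sketch. It is not the quantizer's actual code: it assumes that rules are checked in order, that each `match` entry is a plain substring test against the tensor name, and that tensors outside the listed blocks are left untouched. The function `policy_for` and the tensor names are hypothetical.

```python
import json

CONFIG = json.loads("""
{
  "block_names": ["layers"],
  "rules": [
    { "policy": "keep", "match": ["adaLN", "self_attention.norm"] },
    { "policy": "float8_e4m3fn", "match": ["mlp", "self_attention.to"] }
  ]
}
""")


def policy_for(tensor_name: str, config: dict = CONFIG, default: str = "keep") -> str:
    """Guess the policy for a tensor name under the assumptions stated above."""
    # ASSUMPTION: only tensors that live inside a listed block are considered.
    parts = tensor_name.split(".")
    if not any(block in parts for block in config["block_names"]):
        return default
    # ASSUMPTION: the first rule with a matching substring wins.
    for rule in config["rules"]:
        if any(pattern in tensor_name for pattern in rule["match"]):
            return rule["policy"]
    return default


# Hypothetical tensor names, for illustration only.
for name in [
    "layers.0.mlp.linear_fc1.weight",
    "layers.0.self_attention.to_q.weight",
    "layers.0.self_attention.norm_q.weight",
    "layers.0.adaLN_modulation.weight",
    "final_layer.weight",
]:
    print(f"{name}: {policy_for(name)}")
```

Under these assumptions, the MLP and attention projection weights would be cast to `float8_e4m3fn`, while the adaLN and attention norm weights would stay in their original precision.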
Model tree for Bedovyy/ERNIE-Image-Quantized
Base model: Comfy-Org/ERNIE-Image






