This is the best model I've seen. It is almost real-time.

by tonera - opened 12 days ago

This is the best model I've seen, and a perfect fit for SVDQuant. Quantization is virtually lossless. Here are the quantization results:

Metric	Mean	Median (p50)	p90
PSNR	22.10	22.35	25.58
SSIM	0.876	0.877	0.933
LPIPS	0.0734	0.0714	0.114
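
For readers who want to compute comparable numbers, here is a minimal sketch of a PSNR calculation and of the mean / median / p90 summary used in the table. The post does not say how the metrics were computed, so the 8-bit pixel range (`max_value=255`), the flat-list image representation, and the helper names `psnr` and `summarize` are all assumptions for illustration.

```python
import math
from statistics import mean, median, quantiles


def psnr(reference, candidate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumption: pixels are flat sequences of numbers in [0, max_value].
    """
    if len(reference) != len(candidate):
        raise ValueError("images must have the same number of pixel values")
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def summarize(scores):
    """Mean, median (p50), and p90 of per-image scores (needs at least 2 scores)."""
    # quantiles(..., n=10) returns the 9 decile cut points; the last one is p90.
    p90 = quantiles(scores, n=10, method="inclusive")[-1]
    return {"mean": mean(scores), "p50": median(scores), "p90": p90}
```

In practice one would call `psnr` once per image pair (base output vs. quantized output for the same prompt and seed) and pass the list of scores to `summarize`.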

The quantized version uses only 15.21 GB of VRAM at its peak, and on an RTX 5090 image generation is almost real-time.

Performance benchmarks (RTX 5090 32 GB, 8 steps, guidance scale = 1.0)

Setup	Mode	Peak VRAM	Throughput	Time to image	Throughput change vs. base	VRAM change vs. base
`Base`	`CUDA`	OOM	-	-	Baseline unavailable	Baseline unavailable
`Base`	`MCO`	20.18 GB	0.62 it/s	12 s	Baseline	Baseline
`Base`	`SCO`	2.55 GB	0.48 it/s	16 s	Baseline	Baseline
`Base + TE`	`CUDA`	25.60 GB	1.28 it/s	6 s	N/A (base OOM)	N/A
`Base + TE`	`MCO`	20.16 GB	0.83 it/s	9 s	+33.9%	-0.1%
`Base + TE`	`SCO`	6.08 GB	0.48 it/s	16 s	-0.5%	+138.4%
`Base + TR`	`CUDA`	24.51 GB	3.79 it/s	2 s	N/A (base OOM)	N/A
`Base + TR`	`MCO`	17.39 GB	1.08 it/s	7 s	+75.0%	-13.8%
`Base + TR`	`SCO`	4.35 GB	2.70 it/s	2 s	+461.6%	+70.6%
`Base + TR + TE`	`CUDA`	14.00 GB	3.81 it/s	2 s	N/A (base OOM)	N/A
`Base + TR + TE`	`MCO`	7.52 GB	1.88 it/s	4 s	+204.6%	-62.7%
`Base + TR + TE`	`SCO`	7.69 GB	2.68 it/s	2 s	+457.4%	+201.6%

Setup	Mode	Peak VRAM	Throughput	Time to image	Throughput change vs. base	VRAM change vs. base
`Base`	`CUDA`	OOM	-	-	Baseline unavailable	Baseline unavailable
`Base`	`MCO`	18.53 GB	0.83 it/s	9 s	Baseline	Baseline
`Base`	`SCO`	2.55 GB	0.62 it/s	12 s	Baseline	Baseline
`Base + TR + TE`	`CUDA`	15.21 GB	8.91 it/s	<1 s	N/A (base OOM)	N/A
`Base + TR + TE`	`MCO`	6.42 GB	2.60 it/s	3 s	+214.5%	-65.4%
`Base + TR + TE`	`SCO`	7.72 GB	3.00 it/s	2 s	+383.0%	+202.7%
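
The derived columns in the tables above follow from simple arithmetic. The sketch below shows one way to reproduce them; the helper names are hypothetical. It assumes the "change vs. base" columns are percentage changes relative to the `Base` row of the same mode, and that "Time to image" is the step count divided by throughput. Recomputing from the rounded table values can differ slightly from the published percentages, which were presumably derived from unrounded measurements.

```python
def relative_change(value, baseline):
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


def seconds_per_image(steps, its_per_second):
    """Approximate denoising time for one image at a given throughput."""
    return steps / its_per_second


# First table, `Base + TR + TE` in MCO mode vs. `Base` in MCO mode.
vram_change = relative_change(7.52, 20.18)        # about -62.7 %
throughput_change = relative_change(1.88, 0.62)   # about +203 % (table: +204.6 %)
time_to_image = seconds_per_image(8, 1.88)        # a little over 4 s

print(f"VRAM {vram_change:+.1f}%, throughput {throughput_change:+.1f}%, "
      f"time to image {time_to_image:.1f} s")
```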

Base = black-forest-labs/FLUX.2-klein-9b-kv
TE = tonera/Qwen3-text-Nunchaku
TR = tonera/FLUX.2-klein-9b-kv-Nunchaku/svdq-{precision}_r32-FLUX.2-klein-9b-kv-Nunchaku.safetensors
MCO = enable-model-cpu-offload
SCO = enable-sequential-cpu-offload
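
A minimal sketch of how the three modes could be selected in code. It assumes the legend's MCO and SCO names correspond to the diffusers pipeline methods `enable_model_cpu_offload()` and `enable_sequential_cpu_offload()`; the post spells them with hyphens, so they may instead be flags of the author's own benchmark script. The helper `apply_mode` is hypothetical, and loading the pipeline itself is left out.

```python
def apply_mode(pipe, mode):
    """Place a diffusers-style pipeline according to one of the benchmark modes.

    Assumption: `pipe` exposes the standard diffusers pipeline methods used below.
    """
    if mode == "CUDA":
        # Keep every component resident on the GPU (fastest, highest VRAM).
        pipe.to("cuda")
    elif mode == "MCO":
        # Move whole sub-models to the GPU only while they are running.
        pipe.enable_model_cpu_offload()
    elif mode == "SCO":
        # Offload at a finer granularity (lowest VRAM, usually slowest).
        pipe.enable_sequential_cpu_offload()
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return pipe
```

The offload methods manage device placement themselves, so the pipeline should not also be moved with `pipe.to("cuda")` in the MCO and SCO modes.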
