WaveCut/Lens-Turbo-SDNQ-uint4-static

@@ -25,10 +25,11 @@ The first all-linear UINT4 attempt produced periodic grid artifacts and badly de
 ## Visual Comparison
-[Raw comparison grid](https://huggingface.co/WaveCut/Lens-Turbo-SDNQ-uint4-static/resolve/main/assets/comparison/comparison_grid_1to1_q98.webp)
+**Full-size comparison grid:** the image below is built from native 1024x1024 samples without resampling the image cells and saved as WebP quality 98. Raw file: [assets/comparison/comparison_grid_1to1_q98.webp](https://huggingface.co/WaveCut/Lens-Turbo-SDNQ-uint4-static/resolve/main/assets/comparison/comparison_grid_1to1_q98.webp).
 ![Original vs fixed SDNQ comparison grid](assets/comparison/comparison_grid_1to1_q98.webp)
 ## Quantization Recipe
 | Field | Value |
@@ -104,8 +105,12 @@ Hardware: RunPod NVIDIA H100 80GB HBM3, PyTorch 2.8.0 CUDA 12.8 container, local
 | Load time, seconds | 19.272 | 13.461 |
 | Load peak allocated VRAM, GB | 20.807 | 17.179 |
 | Load peak reserved VRAM, GB | 20.928 | 17.244 |
+| Transformer tensor storage footprint, GB | 16.417 | 4.301 |
+| Transformer storage reduction vs original | baseline | 73.8% smaller |
 | Average prompt runtime, seconds | 1.728 | 3.663 |
+Transformer-only footprint is computed from safetensors tensor storage for the denoising transformer parameter tensors only; it excludes allocator overhead and non-transformer components. The original transformer tensors are F32; the corrected SDNQ transformer stores quantized tensors as U8 plus the excluded modulation layers as BF16.
 ## 10-Prompt Matrix
 | ID | Scenario | Seed | Original time, s | Quant time, s | Delta | Original peak allocated VRAM, GB | Quant peak allocated VRAM, GB |
@@ -177,8 +182,3 @@ An alternate-history Renaissance laboratory where an astronomer-painter is combi
 ## Notes
 This checkpoint is intended for research and evaluation. It inherits the upstream Lens limitations and responsible AI considerations from the source model. Text rendering remains challenging, but the corrected recipe removes the obvious grid/printed texture failure seen in the all-linear UINT4 attempt.
-## Visual comparison
-**Full-size comparison grid:** the image below is built from native 1024x1024 samples without resampling the image cells and saved as WebP quality 98. Raw file: [assets/comparison/comparison_grid_1to1_q98.webp](https://huggingface.co/WaveCut/Lens-Turbo-SDNQ-uint4-static/resolve/main/assets/comparison/comparison_grid_1to1_q98.webp).
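
The transformer tensor storage footprint reported in the benchmark table can be approximated by summing tensor byte ranges read directly from safetensors headers. The following is a minimal sketch, not the script used for this model card. It assumes the standard safetensors layout (an 8-byte little-endian header length followed by a JSON header that gives `data_offsets` for each tensor). The names `tensor_storage_bytes`, `reduction_percent`, and `name_filter` are hypothetical. The card does not say whether GB means 10^9 or 2^30 bytes, so the sketch returns raw bytes and leaves the unit conversion to the caller.

```python
import json
import struct


def tensor_storage_bytes(paths, name_filter=lambda name: True):
    """Sum stored tensor bytes across .safetensors files (header-only read).

    paths: iterable of file paths (hypothetical shard list for one component).
    name_filter: optional predicate on tensor names; keeps everything by default.
    """
    total = 0
    for path in paths:
        with open(path, "rb") as f:
            # First 8 bytes: little-endian unsigned 64-bit JSON header length.
            (header_len,) = struct.unpack("<Q", f.read(8))
            header = json.loads(f.read(header_len))
        for name, info in header.items():
            if name == "__metadata__":  # free-form metadata, not a tensor
                continue
            if name_filter(name):
                begin, end = info["data_offsets"]
                total += end - begin
    return total


def reduction_percent(original, quantized):
    """Storage reduction of the quantized value relative to the original, in percent."""
    return 100.0 * (1.0 - quantized / original)


# Check against the footprint values reported in the benchmark table.
print(f"{reduction_percent(16.417, 4.301):.1f}% smaller")  # 73.8% smaller
```

Passing the list of transformer shard files for each checkpoint to `tensor_storage_bytes` and dividing by the chosen GB definition would give the two footprint values. Because only headers are read, allocator overhead and non-transformer components are left out, which matches the description in the benchmark note.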