Naming notice (2026-04-10). The "PolarQuant" (PQ5) technique used in this model is being rebranded to HLWQ (Hadamard-Lloyd Weight Quantization). Only the name changes; the algorithm and the weights in this repository are unchanged.
The rebrand resolves a name collision with an unrelated, earlier KV cache quantization method also named PolarQuant (Han et al., arXiv:2502.02617, 2025). HLWQ addresses weight quantization with a deterministic Walsh-Hadamard rotation and a Lloyd-Max scalar codebook; Han et al.'s PolarQuant addresses KV cache quantization with a random polar rotation. The two methods are technically distinct.
Existing loaders that load this repository by ID continue to work without changes. Future model uploads will use the HLWQ name.
Reference paper for this technique: arXiv:2603.29078 (v2 in preparation; v1 still uses the old name).
# VOID (Netflix): HLWQ Q5 (Bit-Packed)

PQ5-quantized Netflix VOID: remove objects from video while preserving physical interactions.

42 GB → 13 GB (-69%) | cos_sim 0.9986 | 506 layers quantized
## Compression
| Component | Original | PQ5 Packed | Reduction |
|---|---|---|---|
| VOID Pass 1 (inpainting) | 11.1 GB | 4.7 GB | -58% |
| VOID Pass 2 (refinement) | 11.1 GB | 4.7 GB | -58% |
| T5 Text Encoder | 9.5 GB | 3.1 GB | -67% |
| VAE | 0.4 GB | 0.4 GB | BF16 |
| Total | ~42 GB | ~13 GB | -69% |
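For context on how cos_sim numbers like the one above arise, here is a toy sketch of the HLWQ recipe the naming notice describes (deterministic Walsh-Hadamard rotation, then a 5-bit Lloyd-Max scalar codebook). It illustrates the idea only; the repository's actual per-layer grouping and kernels may differ:

```python
import numpy as np

def hadamard(n: int) -> np.ndarray:
    """Orthonormal Walsh-Hadamard matrix (n must be a power of two)."""
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H / np.sqrt(n)

def lloyd_max(x: np.ndarray, levels: int = 32, iters: int = 20) -> np.ndarray:
    """1-D Lloyd-Max codebook: alternate nearest-code assignment / centroid update."""
    cb = np.quantile(x, np.linspace(0, 1, levels))  # spread initial centroids
    for _ in range(iters):
        assign = np.abs(x[:, None] - cb[None, :]).argmin(axis=1)
        for k in range(levels):
            if np.any(assign == k):
                cb[k] = x[assign == k].mean()
    return cb

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256))                 # toy weight matrix

H = hadamard(W.shape[0])
R = H @ W                                           # deterministic rotation
cb = lloyd_max(R.ravel(), levels=32)                # 32 levels = 5 bits per weight
codes = np.abs(R.ravel()[:, None] - cb[None, :]).argmin(axis=1)
W_hat = H.T @ cb[codes].reshape(R.shape)            # dequantize: inverse rotation
cos_sim = (W * W_hat).sum() / (np.linalg.norm(W) * np.linalg.norm(W_hat))
```

Because the rotation is orthonormal, quantization error in the rotated domain maps one-to-one back to the weight domain, which is why a well-fit 5-bit codebook keeps cosine similarity very close to 1.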
## Quick Start (3 commands)

```bash
# 1. Install
pip install safetensors huggingface_hub scipy diffusers transformers accelerate

# 2. Download & set up (13 GB)
git clone https://huggingface.co/caiovicentino1/VOID-Netflix-HLWQ-Q5 ./VOID-PQ5
cd VOID-PQ5 && python setup.py

# 3. Run inpainting
python generate_void.py --sample lime  # built-in sample

# Or with your own video:
python generate_void.py --video input.mp4 --mask mask.mp4 --prompt "empty room"
```
## What is VOID?
VOID removes objects from video while preserving the physics of the scene:
- Person removed → objects they were holding fall naturally
- Object removed → surrounding items shift realistically
- Uses quadmask conditioning (4-value mask: remove/overlap/affected/keep)
Two-pass pipeline:
- Pass 1: Base inpainting (sufficient for most videos)
- Pass 2: Optical flow-warped refinement for temporal consistency
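The quadmask can be thought of as composing two binary masks into one 4-value map. A minimal sketch; the value convention here is hypothetical (the repository's `generate_void.py` defines the real encoding):

```python
import numpy as np

# Hypothetical value convention for illustration only.
QUAD = {"keep": 0, "remove": 1, "overlap": 2, "affected": 3}

def make_quadmask(remove: np.ndarray, affected: np.ndarray) -> np.ndarray:
    """Compose the 4-value quadmask from two boolean masks per frame."""
    m = np.full(remove.shape, QUAD["keep"], dtype=np.uint8)
    m[affected] = QUAD["affected"]           # region physically influenced by the object
    m[remove] = QUAD["remove"]               # object to delete
    m[remove & affected] = QUAD["overlap"]   # pixels that are both
    return m

remove = np.zeros((8, 8), bool);   remove[2:5, 2:5] = True    # e.g. a held cup
affected = np.zeros((8, 8), bool); affected[4:7, 2:5] = True  # e.g. the hand holding it
mask = make_quadmask(remove, affected)
```

The "affected" class is what lets the model reason about physics: those pixels must change plausibly even though they are not being deleted.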
## Architecture
- CogVideoX 3D Transformer (5B params)
- 42 layers, 48 heads, head_dim=64
- Input: video + quadmask + text prompt
- Resolution: 384x672, up to 197 frames
- DDIM scheduler
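A quick sanity check on the listed shapes, assuming standard multi-head attention where hidden width = heads × head_dim and a 4× MLP (both are assumptions about the architecture, not read from its config):

```python
heads, head_dim, layers = 48, 64, 42
hidden = heads * head_dim               # transformer width
attn = 4 * hidden * hidden              # Q, K, V, O projections
mlp = 2 * hidden * (4 * hidden)         # up- and down-projection
total = layers * (attn + mlp)           # ignores embeddings, norms, patchify
print(hidden, f"{total / 1e9:.2f}B")
```

The block parameters alone land near the quoted 5B; embeddings and the 3D patch/position machinery account for the remainder.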
## Hardware
| GPU | VRAM | Status |
|---|---|---|
| RTX 4090 | 24 GB | Fits after PQ5 dequant |
| A100 | 40 GB | Recommended |
| RTX 3090 | 24 GB | Fits with offloading |
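The table assumes weights dominate VRAM; a rough back-of-the-envelope estimate under that assumption (activations and KV caches add a few GB on top):

```python
# One dequantized 5B-parameter pass in BF16 (2 bytes per parameter).
params = 5e9
weights_gib = params * 2 / 2**30
print(f"{weights_gib:.1f} GiB of weights per pass")
```

That leaves headroom on a 24 GB card for one pass at a time, which is why offloading helps when both passes and the text encoder must coexist.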
## Files

```
VOID-Netflix-HLWQ-Q5/
├── setup.py                 # One-command setup
├── generate_void.py         # Easy inference wrapper
├── polarquant/
│   ├── codes/
│   │   ├── void_pass1_codes.safetensors (3.1 GB)
│   │   ├── void_pass2_codes.safetensors (3.1 GB)
│   │   ├── text_encoder_1_codes.safetensors (1.6 GB)
│   │   └── text_encoder_2_codes.safetensors (1.5 GB)
│   └── bf16/
│       ├── void_pass1_bf16.safetensors (1.6 GB)
│       ├── void_pass2_bf16.safetensors (1.6 GB)
│       └── text_encoder_*_bf16.safetensors
├── vae/diffusion_pytorch_model.safetensors (0.4 GB)
├── transformer/config.json
├── text_encoder/config.json
├── tokenizer/
└── scheduler/
```
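The `*_codes.safetensors` files store the 5-bit code indices bit-packed (the "Bit-Packed" in the title). A minimal numpy sketch of 5-bit packing, assuming a dense MSB-first layout; the repository's actual on-disk layout may differ:

```python
import numpy as np

def pack5(codes: np.ndarray) -> np.ndarray:
    """Pack 5-bit codes (values 0..31) into a uint8 buffer, MSB-first."""
    bits = np.unpackbits(codes.astype(np.uint8)[:, None], axis=1)[:, 3:]  # keep low 5 bits
    return np.packbits(bits.ravel())

def unpack5(buf: np.ndarray, n: int) -> np.ndarray:
    """Recover the first n 5-bit codes packed by pack5."""
    bits = np.unpackbits(buf)[: n * 5].reshape(n, 5)
    pad = np.zeros((n, 3), dtype=np.uint8)   # restore the dropped high bits
    return np.packbits(np.hstack([pad, bits]), axis=1).ravel()

codes = np.arange(32, dtype=np.uint8)        # one of each 5-bit value
packed = pack5(codes)                        # 32 codes * 5 bits = 160 bits = 20 bytes
restored = unpack5(packed, len(codes))
```

Packing is what gets the codes files to roughly 5/16 the size of BF16 weights (5 bits vs 16 per weight), with the small codebooks stored alongside.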
## Links
## Citation

```bibtex
@article{polarquant2026,
  title={HLWQ: Hadamard-Rotated Lloyd-Max Quantization},
  author={Vicentino, Caio},
  journal={arXiv preprint arXiv:2603.29078},
  year={2026}
}

@misc{motamed2026void,
  title={VOID: Video Object and Interaction Deletion},
  author={Motamed, Saman and Harvey, William and Klein, Benjamin and Van Gool, Luc and Yuan, Zhuoning and Cheng, Ta-Ying},
  year={2026},
  eprint={2604.02296},
  archivePrefix={arXiv}
}
```
42 GB → 13 GB with cos_sim 0.9986. Quantized with HLWQ.
