# Qwen2.5-0.5B-PreSINQ GGUF
Pre-SINQ (Sinkhorn Normalization) applied to Qwen2.5-0.5B, converted to GGUF and quantized.
## What is Pre-SINQ?
Pre-SINQ applies a Sinkhorn-inspired weight reparameterization that makes model weights easier to quantize. The reparameterized model's output is mathematically identical to the original, so the transformation itself introduces no accuracy loss; any degradation comes only from the subsequent quantization step.
## Available Quantizations
| File | Size | Quality |
|---|---|---|
| qwen25-0.5b-presinq-f16.gguf | 949M | Perfect (reference) |
| qwen25-0.5b-presinq-q8_0.gguf | 507M | Perfect |
| qwen25-0.5b-presinq-q6_k.gguf | 483M | Perfect |
| qwen25-0.5b-presinq-q5_k_m.gguf | 401M | Perfect |
| qwen25-0.5b-presinq-q5_k_s.gguf | 394M | Perfect |
| qwen25-0.5b-presinq-q5_1.gguf | 400M | Perfect |
| qwen25-0.5b-presinq-q5_0.gguf | 379M | Perfect |
| qwen25-0.5b-presinq-q4_k_m.gguf | 380M | Perfect |
| qwen25-0.5b-presinq-q4_k_s.gguf | 368M | Perfect |
| qwen25-0.5b-presinq-q4_1.gguf | 358M | Good |
| qwen25-0.5b-presinq-q4_0.gguf | 336M | Good |
All quantizations at Q5_0 and above produce output identical to F16; the Q4 variants show minimal degradation.
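A toy demo of why the balanced weights survive 4-bit rounding so well: with symmetric per-tensor absmax quantization, one outlier row inflates the shared scale and wrecks everyone else's precision, while rescaling rows first (a stand-in for Pre-SINQ's rebalancing) keeps the rounding error small. The `absmax_quant` helper and the per-row std rescaling are illustrative assumptions, not the exact GGUF quantization scheme.

```python
import numpy as np

def absmax_quant(W, bits=4):
    """Symmetric per-tensor absmax quantization round trip."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(W).max() / qmax
    q = np.clip(np.round(W / scale), -qmax - 1, qmax)
    return q * scale

rng = np.random.default_rng(1)
W = rng.normal(size=(64, 64))
W[0] *= 50.0  # one outlier row dominates the shared scale

# Quantize directly: the outlier forces a coarse scale for all rows.
err_raw = np.abs(W - absmax_quant(W)).mean()

# Balance rows first, quantize, then fold the scales back in.
r = W.std(axis=1, keepdims=True)
err_bal = np.abs(W - r * absmax_quant(W / r)).mean()

assert err_bal < err_raw  # rebalancing cuts the quantization error
```

The same effect, applied per weight matrix before conversion, is why the Q4 files here lose so little relative to the F16 reference.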
## Usage with prima.cpp
```sh
# Download a quantization
# Then run:
llama-cli -m qwen25-0.5b-presinq-q4_k_m.gguf -p "Hello" -n 64
```
## License

Apache 2.0