PreSINQ GGUF Quantized Qwen3-0.6B Model
This repository contains the official PreSINQ GGUF-quantized versions of the Qwen3-0.6B model. For a detailed explanation of the PreSINQ strategy, please refer to the official SINQ repository.
SINQ is a fast and high-quality quantization technique designed to significantly reduce Large Language Model size while preserving accuracy.
If you find this project useful, please consider giving a ⭐ to the official SINQ repository.
Model Details
- Model Name: Qwen3-0.6B-PreSINQ-GGUF
- Base Model: Qwen/Qwen3-0.6B
- Task: Text Generation
- Framework: PyTorch / Transformers
- License: Apache-2.0
- Quantized By: Huawei Computing Systems Lab
How to Obtain the PreSINQ Model
The PreSINQ Qwen3-0.6B models are produced using the PreSINQ GGUF script available in the official SINQ repository.
The models provided here correspond to the best-performing configurations for each quantization type.
Best PreSINQ Quantization Results (Qwen3-0.6B)
Results below are measured on the WikiText-2 test set.
| Method | Bits | Size (GB) | Perplexity ↓ |
|---|---|---|---|
| Baseline (FP16) | FP16 | 1.41 | 21.8769 |
| Baseline + Q4_K_S | 4-bit | 0.45 | 24.3443 |
| PreSINQ + Q4_K_S | 4-bit | 0.37 | 22.9176 |
| Baseline + Q3_K_S | 3-bit | 0.37 | 35.5913 |
| PreSINQ + Q3_K_S | 3-bit | 0.31 | 29.1805 |
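As a quick sanity check on the table above, the size and perplexity trade-off of the best 4-bit PreSINQ model relative to the FP16 baseline can be computed directly from the reported values:

```python
# Values copied from the WikiText-2 results table above.
fp16_size_gb = 1.41
presinq_q4_size_gb = 0.37
fp16_ppl = 21.8769
presinq_q4_ppl = 22.9176

# Relative size reduction and perplexity increase vs. the FP16 baseline.
size_reduction = 1 - presinq_q4_size_gb / fp16_size_gb
ppl_increase = presinq_q4_ppl / fp16_ppl - 1

print(f"Size reduction vs FP16: {size_reduction:.1%}")     # ~73.8%
print(f"Perplexity increase vs FP16: {ppl_increase:.1%}")  # ~4.8%
```

In other words, PreSINQ + Q4_K_S shrinks the model to roughly a quarter of its FP16 size at the cost of a few percent higher perplexity.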
Alternatively, you can generate good (though not necessarily best-performing) PreSINQ models faster by reducing the number of configurations explored during the PreSINQ script execution.
Usage
You can load and run the PreSINQ GGUF models using:
- 🤗 Transformers
- llama.cpp
- Any GGUF-compatible inference framework
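For example, with llama.cpp the model can be run from the command line. This is a minimal sketch; the GGUF filename below is an assumption, so check this repository's file listing for the exact name:

```shell
# Run the 4-bit PreSINQ model with llama.cpp's CLI.
# NOTE: the .gguf filename is assumed -- use the actual file from this repo.
./llama-cli -m Qwen3-0.6B-PreSINQ-Q4_K_S.gguf \
  -p "Explain quantization in one sentence." -n 64
```

The same file can be loaded by any other GGUF-compatible runtime by pointing it at the downloaded `.gguf` file.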
How to Cite This Work
If you find SINQ useful in your research or applications, please cite:
@misc{muller2025sinq,
  title={SINQ: Sinkhorn-Normalized Quantization for Calibration-Free Low-Precision LLM Weights},
  author={Lorenz K. Muller and Philippe Bich and Jiawei Zhuang and Ahmet Celik and Luca Benfenati and Lukas Cavigelli},
  year={2025},
  eprint={2509.22944},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={http://arxiv.org/abs/2509.22944}
}