OmniMath-2B - GGUF Quantized
This is the official GGUF repository for the OmniMath-2B model.
All quantizations provided here are created and maintained by the original model author.
Original model: ZirTech/OmniMath-2B
Quantizations & File Sizes
| Quantization | File Size | Description |
|---|---|---|
| Q2_K | 969 MB | 2-bit, smallest, lowest quality |
| Q3_K_S | 1.02 GB | 3-bit, small |
| Q3_K_M | 1.10 GB | 3-bit, medium |
| Q3_K_L | 1.16 GB | 3-bit, large |
| IQ4_XS | 1.20 GB | 4-bit i-quant, extra small; improved accuracy for its size |
| Q4_K_S | 1.21 GB | 4-bit, small |
| Q4_K_M | 1.27 GB | 4-bit, medium (good balance) |
| Q5_K_S | 1.37 GB | 5-bit, small |
| Q5_K_M | 1.41 GB | 5-bit, medium |
| Q6_K | 1.56 GB | 6-bit, high quality |
| Q8_0 | 2.01 GB | 8-bit, near‑original quality |
| F16 | 3.78 GB | 16-bit float (original weights) |
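Any of the files above can be loaded with a GGUF-compatible runtime such as llama.cpp or llama-cpp-python. The sketch below is only illustrative: the repository id and filename are assumptions, so replace them with the exact values shown on this repository's file listing.

```python
# Minimal sketch: download one quantization and load it with llama-cpp-python.
# The repo_id and filename below are assumptions -- use the actual names from
# this repository's file listing.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="ZirTech/OmniMath-2B-GGUF",   # assumed id of this GGUF repository
    filename="OmniMath-2B-Q6_K.gguf",     # assumed filename of the Q6_K quant
)

llm = Llama(
    model_path=model_path,
    n_ctx=4096,        # context window; lower it to reduce memory use
    n_gpu_layers=-1,   # offload all layers to GPU if a compatible backend is available
)
```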
Recommended Quantization
Since OmniMath-2B is specialized for mathematical reasoning, accuracy is paramount. Lower bit quantizations (Q2, Q3, Q4) may degrade performance on complex problems.
| Recommendation | Quantization | Size | Notes |
|---|---|---|---|
| Best for math (minimal quality loss) | Q8_0 or F16 | 2.01 GB / 3.78 GB | Use if you have enough RAM/VRAM. |
| Good trade‑off (recommended) | Q6_K or Q5_K_M | 1.56 GB / 1.41 GB | Still high accuracy, much smaller than F16. |
| Minimum acceptable (for tight memory) | Q4_K_M | 1.27 GB | May lose some precision; test before using in production. |
| Not recommended | Q2_K, Q3_K_*, IQ4_XS | < 1.2 GB | Likely to degrade mathematical reasoning. |
Tip: For the best results, use Q8_0 or Q6_K. If you need to save space, Q5_K_M is the lowest we recommend for math.
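As a quick sanity check after choosing a quantization, you can run a short math prompt and compare the answer against a higher-bit file (Q8_0 or F16). The snippet below uses llama-cpp-python's chat completion API with the `llm` instance from the loading sketch above; the prompt and sampling settings are only illustrative.

```python
# Illustrative sanity check for a chosen quantization (e.g. Q6_K).
# `llm` is the Llama instance created in the loading sketch above.
response = llm.create_chat_completion(
    messages=[
        {"role": "user",
         "content": "Solve step by step: what is the sum of the first 50 positive integers?"}
    ],
    max_tokens=256,
    temperature=0.2,   # low temperature keeps arithmetic output more deterministic
)
print(response["choices"][0]["message"]["content"])
```

If a lower-bit file starts making arithmetic or reasoning mistakes that the Q8_0 or F16 file does not, move up one quantization level.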
License
This repository and the quantized files are released under the ztech-license. Please refer to the original model repository for the full license text.
The GGUF format quantizations are provided by the original author. No third-party ownership is claimed.
Built by Zirt Tech ❤️