OmniMath-2B - GGUF Quantized
This is the official GGUF repository for the OmniMath-2B model.
All quantizations provided here are created and maintained by the original model author.
Original model: ZirTech/OmniMath-2B
Quantizations & File Sizes
| Quantization | File Size | Description |
|---|---|---|
| Q2_K | 969 MB | 2-bit, smallest, lowest quality |
| Q3_K_S | 1.02 GB | 3-bit, small |
| Q3_K_M | 1.10 GB | 3-bit, medium |
| Q3_K_L | 1.16 GB | 3-bit, large |
| IQ4_XS | 1.20 GB | 4-bit i-quant, extra small; improved accuracy for its size |
| Q4_K_S | 1.21 GB | 4-bit, small |
| Q4_K_M | 1.27 GB | 4-bit, medium (good balance) |
| Q5_K_S | 1.37 GB | 5-bit, small |
| Q5_K_M | 1.41 GB | 5-bit, medium |
| Q6_K | 1.56 GB | 6-bit, high quality |
| Q8_0 | 2.01 GB | 8-bit, near‑original quality |
| F16 | 3.78 GB | 16-bit float (original weights) |
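Any of the files above can be loaded with a GGUF-compatible runtime such as llama.cpp or llama-cpp-python. The sketch below is only illustrative: the repository id and filename are assumptions, so replace them with the exact values shown on this repository's file listing.

```python
# Minimal sketch: download one quantization and load it with llama-cpp-python.
# The repo_id and filename below are assumptions -- use the actual names from
# this repository's file listing.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

model_path = hf_hub_download(
    repo_id="ZirTech/OmniMath-2B-GGUF",   # assumed id of this GGUF repository
    filename="OmniMath-2B-Q6_K.gguf",     # assumed filename of the Q6_K quant
)

llm = Llama(
    model_path=model_path,
    n_ctx=4096,        # context window; lower it to reduce memory use
    n_gpu_layers=-1,   # offload all layers to GPU if a compatible backend is available
)
```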
Recommended Quantization
Since OmniMath-2B is specialized for mathematical reasoning, accuracy is paramount. Lower bit quantizations (Q2, Q3, Q4) may degrade performance on complex problems.
| Recommendation | Quantization | Size | Notes |
|---|---|---|---|
| Best for math (minimal quality loss) | Q8_0 or F16 | 2.01 GB / 3.78 GB | Use if you have enough RAM/VRAM. |
| Good trade‑off (recommended) | Q6_K or Q5_K_M | 1.56 GB / 1.41 GB | Still high accuracy, much smaller than F16. |
| Minimum acceptable (for tight memory) | Q4_K_M | 1.27 GB | May lose some precision; test before using in production. |
| Not recommended | Q2_K, Q3_K_*, IQ4_XS | < 1.2 GB | Likely to degrade mathematical reasoning. |
Tip: For the best results, use Q8_0 or Q6_K. If you need to save space, Q5_K_M is the lowest we recommend for math.
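As a quick sanity check after choosing a quantization, you can run a short math prompt and compare the answer against a higher-bit file (Q8_0 or F16). The snippet below uses llama-cpp-python's chat completion API with the `llm` instance from the loading sketch above; the prompt and sampling settings are only illustrative.

```python
# Illustrative sanity check for a chosen quantization (e.g. Q6_K).
# `llm` is the Llama instance created in the loading sketch above.
response = llm.create_chat_completion(
    messages=[
        {"role": "user",
         "content": "Solve step by step: what is the sum of the first 50 positive integers?"}
    ],
    max_tokens=256,
    temperature=0.2,   # low temperature keeps arithmetic output more deterministic
)
print(response["choices"][0]["message"]["content"])
```

If a lower-bit file starts making arithmetic or reasoning mistakes that the Q8_0 or F16 file does not, move up one quantization level.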
License
This repository and the quantized files are released under the ztech-license. Please refer to the original model repository for the full license text.
The GGUF format quantizations are provided by the original author. No third-party ownership is claimed.
Built by Zirt Tech ❤️