---
license: other
license_name: ztech-license
license_link: https://huggingface.co/ZirTech/OmniMath-2B/resolve/main/LICENSE
language:
- en
pipeline_tag: text-generation
---

# 🧮 OmniMath-2B
OmniMath-2B is a compact yet capable mathematical reasoning model, fine‑tuned on top of **Qwen3.5‑2B**'s hybrid architecture (Gated Delta Networks interleaved with standard attention). Trained on **10,000** carefully selected math problems from five diverse open‑source datasets, it excels at step‑by‑step solutions, arithmetic word problems, geometry reasoning, and error recovery.
Despite its small size, OmniMath-2B demonstrates strong chain‑of‑thought performance and is ideally suited for resource‑constrained environments, edge deployment, and fast prototyping.
---
## ✨ Key Features
- **Efficient 2B Scale**: Only 2 billion parameters – runs smoothly on a single T4 GPU, or even on CPU with quantization.
- **Multi‑Source Math Training**: Balanced mix of real‑world problems (`orca‑math`, `GSM8K`), synthetic reasoning (`MetaMathQA`), geometry (`Geo‑Thought`), and multi‑modal math (`DeepVision` text subset).
- **Step‑by‑Step Reasoning**: Trained with explicit chain‑of‑thought prompts that spell out intermediate reasoning steps.
- **Hybrid Architecture**: Inherits Qwen3.5's Gated Delta Networks for efficient long‑context processing.
---
## 📊 Benchmarks
*Preliminary results (evaluation ongoing).*
| Model | Size (params) | GSM8K Accuracy |
|-------|---------------|----------------|
| **OmniMath-2B (0-shot CoT)** | **2B** | **63.76%** |
| gemma-3-1b-it | 1.1B | 62.8% |
| dolphin-2_6-phi-2 | 2.7B | 58.07% |
| Qwen2.5-Math-1.5B | 1.5B | 54.0% |
| MobileLLM-R1.5 950M | 0.95B | 52.8% |
| Phi-2 (0-shot CoT) | 2.7B | 50.0% |
| Qwen2.5-0.5B-Instruct | 0.5B | 49.6% |
| Gemma 2 2B IT | 2B | 23.9% |
*Updates coming soon.*
---
## 🚀 Quickstart
### 🤗 Transformers
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZirTech/OmniMath-2B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful math assistant. Solve problems step by step."},
    {"role": "user", "content": "A store sells apples for $2 each. If you buy 5 apples, how much do you pay?"},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.6, top_p=0.95, top_k=20)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```
---
## ⚡ vLLM
```shell
vllm serve ZirTech/OmniMath-2B --tensor-parallel-size 1 --max-model-len 4096
```
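Once the server is up, vLLM exposes an OpenAI-compatible API. A minimal client sketch using only the standard library (this assumes the default server address `http://localhost:8000`; adjust if you changed the port):

```python
import json
import urllib.request

# vLLM's OpenAI-compatible chat endpoint (default address assumed).
url = "http://localhost:8000/v1/chat/completions"

payload = {
    "model": "ZirTech/OmniMath-2B",
    "messages": [
        {"role": "system", "content": "You are a helpful math assistant. Solve problems step by step."},
        {"role": "user", "content": "What is 17 * 24?"},
    ],
    "temperature": 0.6,
    "max_tokens": 256,
}

req = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Uncomment once the server is running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```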
Alternatively, load the model directly with Transformers for greedy decoding:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ZirTech/OmniMath-2B"

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
model.eval()

def ask(question):
    # Build a ChatML prompt manually (matches the tokenizer's chat format).
    prompt = (
        "<|im_start|>system\nYou are a helpful math assistant.<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n<|im_start|>assistant\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
    # Trim any run-on turn the model may start after its answer.
    if "user" in response:
        response = response.split("user")[0].strip()
    return response

print(ask("Find the degree for the given field extension Q(sqrt(2), sqrt(3), sqrt(18)) over Q. Give me the answer."))
```
---
## 🏗️ Architecture
OmniMath‑2B fully preserves Qwen3.5‑2B's design:
* **Gated Delta Networks**: linear-attention layers interleaved with standard attention.
* **262K native context**: supports up to 262,144 tokens (extendable with YaRN).
* **Built on `Qwen3_5ForCausalLM`**: integrates seamlessly with the Hugging Face ecosystem.
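Context extension with YaRN is typically enabled through the `rope_scaling` entry in `config.json`. A sketch of what that looks like, assuming Qwen-style rope-scaling support (the `factor` value here is illustrative, not a tested setting for this model):

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144
  }
}
```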
---
## ⚠️ Limitations
* Numerical accuracy may occasionally falter; always double‑check critical calculations.
* Geometry training used only textual descriptions, so performance on image‑based geometry problems is limited.
* Non‑English math problems have not been thoroughly evaluated.
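One lightweight guard against the arithmetic slips noted above is to pull the final number out of a response and re-check it programmatically. A sketch (`extract_final_number` is a hypothetical helper, not part of this model's tooling):

```python
import re

def extract_final_number(text: str):
    """Return the last number mentioned in a model response, or None."""
    # Strip thousands separators, then grab every signed int/decimal.
    matches = re.findall(r"-?\d+(?:\.\d+)?", text.replace(",", ""))
    return float(matches[-1]) if matches else None

response = "Each apple costs $2, so 5 apples cost 5 * 2 = 10 dollars."
answer = extract_final_number(response)
assert answer == 10.0  # cross-check the model's claim independently
```

For critical use, compare the extracted value against an independent computation (e.g. `sympy` or plain Python arithmetic) before trusting it.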
---
## 🙏 Acknowledgments
* Qwen Team for the outstanding Qwen3.5 base models.
* Hugging Face for dataset hosting and the Transformers library.
* Kaggle for providing free GPU hours.
---
## 📖 Citation
```bibtex
@misc{omnimath2b2026,
  title={OmniMath-2B: A Lightweight Open Mathematical Reasoning Model},
  author={Zirt Techniques},
  year={2026},
  url={https://huggingface.co/ZirTech/OmniMath-2B}
}
```
---
**Built by [Zirt Tech](https://huggingface.co/ZirTech) ❤️**