---
license: other
license_name: ztech-license
license_link: https://huggingface.co/ZirTech/OmniMath-2B/resolve/main/LICENSE
language:
- en
pipeline_tag: text-generation
---
![image](https://cdn-uploads.huggingface.co/production/uploads/69b3129f02a20db8381db62e/HnRi-9kPNTvDJEO71OO82.png)
---

# 🧮 OmniMath-2B

OmniMath-2B is a compact yet capable mathematical reasoning model, fine-tuned on top of **Qwen3.5-2B**'s hybrid architecture (Gated Delta Networks interleaved with standard attention). Trained on **10,000** carefully selected math problems from five diverse open-source datasets, it excels at step-by-step solutions, arithmetic word problems, geometry reasoning, and error recovery. Despite its small size, OmniMath-2B demonstrates strong chain-of-thought performance and is well suited to resource-constrained environments, edge deployment, and fast prototyping.

---

## ✨ Key Features

- **Efficient 2B Scale**: Only 2 billion parameters – runs smoothly on a single T4 GPU, or even on CPU with quantization.
- **Multi-Source Math Training**: Balanced mix of real-world problems (`orca-math`, `GSM8K`), synthetic reasoning (`MetaMathQA`), geometry (`Geo-Thought`), and multi-modal math (`DeepVision` text subset).
- **Step-by-Step Reasoning**: Trained with explicit chain-of-thought prompts.
- **Hybrid Architecture**: Inherits Qwen3.5's Gated Delta Networks for efficient long-context processing.

---

## 📊 Benchmarks

*Preliminary results (evaluation ongoing).*

| Model | Size (params) | GSM8K Accuracy |
|-------|---------------|----------------|
| Qwen2.5-Math-1.5B | 1.5B | 54.0% |
| Phi-2 (0-shot CoT) | 2.7B | 50.0% |
| **OmniMath-2B (0-shot CoT)** | **2B** | **63.76%** |
| dolphin-2_6-phi-2 | 2.7B | 58.07% |
| Qwen2.5-0.5B-Instruct | 0.5B | 49.6% |
| gemma-3-1b-it | 1B | 62.8% |
| MobileLLM-R1.5 950M | 1B | 52.8% |
| Gemma 2 2B IT | 2B | 23.9% |

*Updates coming soon.*

---

## 🚀 Quickstart

### 🤗 Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ZirTech/OmniMath-2B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful math assistant. Solve problems step by step."},
    {"role": "user", "content": "A store sells apples for $2 each. If you buy 5 apples, how much do you pay?"}
]

text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=256, temperature=0.6, top_p=0.95, top_k=20)
print(tokenizer.decode(outputs[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```

---

### ⚡ vLLM

Serve the model behind vLLM's OpenAI-compatible API:

```shell
vllm serve ZirTech/OmniMath-2B --tensor-parallel-size 1 --max-model-len 4096
```

Alternatively, run inference directly with Transformers using a manually built ChatML prompt:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "ZirTech/OmniMath-2B"
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)
model.eval()

def ask(question):
    prompt = (
        f"<|im_start|>system\nYou are a helpful math assistant.<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    # Greedy decoding for reproducible answers
    outputs = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    response = tokenizer.decode(outputs[0][inputs.input_ids.shape[1]:], skip_special_tokens=True)
    # Trim any runaway continuation into a new user turn
    if "user" in response:
        response = response.split("user")[0].strip()
    return response

print(ask("Find the degree of the field extension Q(sqrt(2), sqrt(3), sqrt(18)) over Q. Give me the answer."))
```

---

## 🏗️ Architecture

OmniMath-2B fully preserves Qwen3.5-2B's design:

* **Gated Delta Networks**: Linear attention layers interleaved with standard attention.
* **262K Native Context**: Supports up to 262,144 tokens (extendable with YaRN).
* **Built on Qwen3_5ForCausalLM**: Seamless integration with the Hugging Face ecosystem.

---

## ⚠️ Limitations

* Numerical accuracy may occasionally falter – always double-check critical calculations.
* Geometry was trained only on textual descriptions of problems; performance on image-based geometry is limited.
* Non-English math problems are not thoroughly evaluated.

---

## 🙏 Acknowledgments

* Qwen Team for the outstanding Qwen3.5 base models.
* Hugging Face for dataset hosting and the Transformers library.
* Kaggle for providing free GPU hours.

---

## 📖 Citation

```bibtex
@misc{omnimath2b2026,
  title={OmniMath-2B: A Lightweight Open Mathematical Reasoning Model},
  author={Zirt Techniques},
  year={2026},
  url={https://huggingface.co/ZirTech/OmniMath-2B}
}
```

---
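The direct-inference snippet in the vLLM section above assembles its ChatML prompt by hand. As a small self-contained sketch (the helper name `build_chatml_prompt` is my own, not part of the model's API), the same construction can be factored out and checked without loading the model:

```python
def build_chatml_prompt(question: str, system: str = "You are a helpful math assistant.") -> str:
    """Assemble a ChatML-format prompt: a system turn, a user turn,
    then an opened assistant turn for the model to complete."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{question}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

print(build_chatml_prompt("What is 12 * 7?"))
```

In practice, `tokenizer.apply_chat_template(...)` (as in the Transformers quickstart) is the safer option, since it always matches the template shipped with the tokenizer.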
**Built by [Zirt Tech](https://huggingface.co/ZirTech) ❤️**