# Sn-Logicer-0.8B

A fine-tune of Qwen/Qwen3.5-0.8B optimized for grade-school math reasoning, trained on ~7k synthetic math word problems generated by DeepSeek v3.2.

No GSM8K data was used for training. GSM8K is used solely as a held-out evaluation benchmark.

## Results

Evaluated with lm-eval-harness (`gsm8k_cot_llama` task, 8-shot chain-of-thought):

| Model | Flexible Extract | Strict Match |
|---|---|---|
| Qwen3.5-0.8B (base) | 48.45% | 47.69% |
| Sn-Logicer-0.8B | 50.57% | 50.42% |
| Δ (points) | +2.12 | +2.73 |

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("SnurfyAI/Sn-Logicer-0.8B", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("SnurfyAI/Sn-Logicer-0.8B", trust_remote_code=True)

messages = [
    {"role": "user", "content": "A store sells 3 shirts at $15 each and 2 pants at $25 each. If a customer buys all of them with a 10% discount, how much do they pay?"}
]

prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
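For reference, the answer expected from the example prompt can be checked by hand:

```python
# Sanity check of the arithmetic in the example prompt:
# 3 shirts at $15 plus 2 pants at $25, with a 10% discount.
subtotal = 3 * 15 + 2 * 25      # $95 before discount
total = subtotal * (1 - 0.10)   # apply the 10% discount
print(f"${total:.2f}")          # $85.50
```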

## Training Details

### Dataset

- 7,077 synthetic math word problems generated via DeepSeek v3.2 through OpenRouter
- Problems cover arithmetic, fractions, percentages, rates, money, time/distance, geometry, combinatorics, and unit conversions
- Each example includes step-by-step reasoning ending with `#### <answer>`
- Training data is entirely synthetic; no existing math benchmarks were used
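The `#### <answer>` delimiter makes the final answer easy to parse out of a model completion. A minimal sketch of such a parser (the function name and regex are illustrative, not part of the training code):

```python
import re

def extract_answer(completion: str):
    """Return the text after the final '####' delimiter, or None if absent."""
    m = re.search(r"####\s*([-\d.,]+)", completion)
    # Strip thousands separators so "1,234" and "1234" compare equal.
    return m.group(1).replace(",", "") if m else None

sample = "The shirts cost 3 * 15 = 45 dollars... #### 85.50"
print(extract_answer(sample))  # 85.50
```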

### Hyperparameters

| Parameter | Value |
|---|---|
| Epochs | 3 |
| Batch size (effective) | 16 (4 × 4 grad accum) |
| Learning rate | 2e-5 |
| LR scheduler | Cosine |
| Warmup steps | 50 |
| Weight decay | 0.01 |
| Max sequence length | 512 |
| Precision | bfloat16 |
| Optimizer | AdamW |
| Gradient checkpointing | Enabled |
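These settings imply a fairly short run. A rough back-of-the-envelope on the optimizer step count (illustrative only; exact counts depend on dataloader drop/pad behavior):

```python
import math

examples = 7077          # dataset size from the section above
effective_batch = 4 * 4  # per-device batch x gradient accumulation
epochs = 3

steps_per_epoch = math.ceil(examples / effective_batch)
total_steps = steps_per_epoch * epochs
warmup_fraction = 50 / total_steps  # 50 warmup steps from the table

print(steps_per_epoch, total_steps, round(warmup_fraction, 3))  # 443 1329 0.038
```

So the 50 warmup steps cover roughly the first 4% of training.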

### Infrastructure

- Hardware: NVIDIA RTX 5090 (32GB)
- Training time: ~3 hours
- Data generation: DeepSeek v3.2 via OpenRouter API

### Framework Versions

- TRL: 0.29.0
- Transformers: 5.3.0
- PyTorch: 2.10.0+cu130
- Datasets: 4.8.3
- Tokenizers: 0.22.2

## Evaluation Command

```shell
lm_eval --model hf \
  --model_args pretrained=SnurfyAI/Sn-Logicer-0.8B,trust_remote_code=True \
  --tasks gsm8k_cot_llama \
  --num_fewshot 8 \
  --apply_chat_template \
  --fewshot_as_multiturn \
  --batch_size auto
```

## Limitations

- Trained only on synthetic grade-school math; may not generalize to advanced mathematics
- The ~2-point improvement over the base model is modest; more or higher-quality training data would likely yield larger gains
- Inherits all limitations of the base Qwen3.5-0.8B model

## Citations

Cite Qwen3.5 as:

```bibtex
@misc{qwen3.5,
    title  = {{Qwen3.5}: Towards Native Multimodal Agents},
    author = {{Qwen Team}},
    month  = {February},
    year   = {2026},
    url    = {https://qwen.ai/blog?id=qwen3.5}
}
```

Cite TRL as:

```bibtex
@software{vonwerra2020trl,
  title   = {{TRL: Transformers Reinforcement Learning}},
  author  = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
  license = {Apache-2.0},
  url     = {https://github.com/huggingface/trl},
  year    = {2020}
}
```

Cite this model as:

```bibtex
@misc{snurfyai2026snlogicer,
  title  = {Sn-Logicer-0.8B: Math Reasoning Fine-tune of Qwen3.5-0.8B},
  author = {SnurfyAI},
  year   = {2026},
  url    = {https://huggingface.co/SnurfyAI/Sn-Logicer-0.8B}
}
```