🧠 Srikri7/qwen3.5-2b-reasoning

📢 Release Note: Build Environment Upgrades

  • Fine-tuning Framework: Unsloth 2026.3.7
  • Core Dependencies: Transformers 5.3.0, Torch 2.10.0+cu128
  • Hardware: Optimized for Tesla T4 (16GB VRAM) using 4-bit NormalFloat (NF4) quantization.
  • Native Developer Role: Built-in support for the "developer" role, ensuring compatibility with modern coding agents (Claude Code, OpenCode).
  • Continuous Thinking: Optimized to run autonomously for over 9 minutes without stalling.
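As a minimal sketch of the "developer" role mentioned above, the snippet below renders a message list into a ChatML-style prompt, assuming the fine-tune keeps Qwen's ChatML template (`<|im_start|>` / `<|im_end|>`). The role name "developer" is the feature this card describes, not a standard `transformers` role, so verify it against the model's own chat template before relying on it.

```python
# Sketch: building a ChatML-style prompt that includes the "developer" role.
# Assumption: the model uses Qwen's ChatML delimiters; check the actual
# tokenizer chat template shipped with the model before production use.

def build_chatml_prompt(messages):
    """Render a list of {role, content} dicts as a ChatML prompt string."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>")
    # Leave the assistant turn open so the model generates the reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

messages = [
    {"role": "developer", "content": "You are a coding agent. Reason inside <think> tags."},
    {"role": "user", "content": "How many 41-seater buses are needed for 252 students?"},
]
prompt = build_chatml_prompt(messages)
print(prompt)
```

In practice you would pass the same message list to the tokenizer's `apply_chat_template` instead of formatting by hand; the manual version here just makes the wire format visible.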

💡 Model Introduction

qwen3.5-2b-reasoning is a highly efficient reasoning model fine-tuned from the Qwen3.5-2B architecture. Despite having only 2 billion parameters, it leverages high-density Chain-of-Thought (CoT) distillation, primarily sourced from Claude-4.6 Opus trajectories.

The model is specifically trained to avoid the "repetitive loop" failure mode common in small models by enforcing a strict hierarchy of analytical thought within <think> tags.

🧠 Learned Reasoning Scaffold

The model adopts a streamlined structured thinking pattern to ensure deep analytical capacity without redundant cognitive loops:

<think>
1. [Understanding]: Restate the core objective and identify key numerical constraints (e.g., "252 students", "41-seater bus").
2. [Plan]: Identify necessary strategies or math rules (e.g., Product Rule, Rounding-up logic).
3. [Step-by-step Reasoning]: Execute transformations with intermediate justifications.
4. [Verification]: Cross-check the final result against the initial constraints.
</think>
[Final Answer]
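The scaffold above can be consumed programmatically by splitting a completion into its thinking block and final answer. This is a hedged sketch that assumes the model emits exactly one well-formed <think>...</think> block before the answer, as the scaffold specifies; the helper name `split_reasoning` and the sample completion are illustrative, not part of the model's API.

```python
import re

# Sketch: separating the <think> scaffold from the final answer.
# Assumption: one <think>...</think> block precedes the answer; malformed
# outputs fall back to treating the whole completion as the answer.
THINK_RE = re.compile(r"<think>(.*?)</think>\s*(.*)", re.DOTALL)

def split_reasoning(completion: str):
    """Return (thinking, answer); without a <think> block, all text is the answer."""
    m = THINK_RE.search(completion)
    if m is None:
        return "", completion.strip()
    return m.group(1).strip(), m.group(2).strip()

# Hypothetical completion following the scaffold's bus example.
sample = (
    "<think>\n"
    "1. [Understanding]: 252 students, 41-seater buses.\n"
    "2. [Plan]: Divide and round up.\n"
    "3. [Step-by-step Reasoning]: 252 / 41 = 6.14..., so 7 buses.\n"
    "4. [Verification]: 6 * 41 = 246 < 252, so 7 buses are needed.\n"
    "</think>\n"
    "7 buses"
)
thinking, answer = split_reasoning(sample)
print(answer)  # -> 7 buses
```

Keeping the parse tolerant of a missing block is deliberate: small models occasionally skip the scaffold, and dropping the completion entirely would be worse than returning it unlabeled.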