# Qwen3-0.6B-Reasoning-Opus
This is a fine-tuned version of Qwen3-0.6B optimized for multi-step reasoning. It was trained using QLoRA on a filtered dataset of reasoning traces distilled from Claude 4.6 Opus. The goal of this project was to induce "System 2" (deliberate) thinking in a sub-1B parameter model.
## Model Details
- Developed by: Shreyansh Pathak
- Institution: Dayananda Sagar College of Engineering (DSCE), Bangalore
- Model type: Causal Language Model with Chain-of-Thought (CoT) capabilities.
- Base Model: Qwen/Qwen3-0.6B
- Language(s): English
- Fine-tuning Technique: QLoRA (via Unsloth)
  - LoRA rank (r): 16
  - LoRA alpha: 16
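To put the rank-16 adapter in perspective, the number of trainable parameters LoRA adds to a weight matrix `W` of shape `(d_out, d_in)` is `r * (d_in + d_out)`, since it learns a down-projection `A` (`r × d_in`) and an up-projection `B` (`d_out × r`). A minimal back-of-the-envelope sketch (the 1024×1024 projection size below is a hypothetical illustration, not a Qwen3 layer dimension):

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters added by a LoRA adapter of rank r:
    B is (d_out x r) and A is (r x d_in)."""
    return r * (d_in + d_out)

# Hypothetical 1024x1024 projection with r=16: the adapter adds
# 32,768 trainable parameters vs. ~1M frozen parameters in W.
full = 1024 * 1024
adapter = lora_params(1024, 1024, 16)
print(adapter, round(100 * adapter / full, 1))  # → 32768 3.1
```

This is why QLoRA fine-tuning of even a 0.6B model fits comfortably on a single T4: only a few percent of the weights receive gradients.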
## Performance & Evaluation
The model was evaluated against the base Qwen3-0.6B model on a 50-sample random subset of the GSM8K (Grade School Math) benchmark to measure logical consistency and arithmetic accuracy.
| Model | GSM8K Accuracy (n=50) | Improvement |
|---|---|---|
| Base Qwen3-0.6B | 26.0% | Baseline |
| Qwen3-0.6B-Reasoning-Opus | 32.0% | +6.0 pp (absolute) |
### Key Findings
- Reasoning Activation: The fine-tuned model reliably emits a `<think>` block for complex queries, whereas the base model typically gives direct, often incorrect, answers.
- Alignment Tax: While math accuracy increased, the model exhibits some "overthinking" on simple logic riddles, a common trade-off in small-parameter reasoning models.
- Relative Gain: The model showed a ~23% relative improvement in math problem-solving over its pre-trained state (26.0% → 32.0%).
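The relative figure follows directly from the table above: the 6.0-point absolute gain divided by the 26.0% baseline gives roughly 23%.

```python
base, tuned = 26.0, 32.0  # GSM8K accuracy (%) from the table above

absolute = tuned - base            # 6.0 percentage points
relative = (tuned - base) / base   # ≈ 0.231, i.e. ~23% relative gain

print(f"{absolute:.1f} pp absolute, {relative:.0%} relative")
# → 6.0 pp absolute, 23% relative
```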
## Training Procedure
- Hardware: NVIDIA Tesla T4 (via Google Colab)
- Optimizer: AdamW (8-bit)
- Learning Rate: 2e-4
- Batch Size: 2 (Gradient Accumulation: 4, effective batch size: 8)
- Training Steps: 60
- Dataset: nohurry/Opus-4.6-Reasoning-3000x-filtered
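The hyperparameters above can be summarized as a plain config dict; the key names below follow common TRL/Unsloth conventions but are an illustrative assumption, since the exact training script is not included in this card.

```python
# Hypothetical summary of the hyperparameters listed above
# (key names are illustrative, not the actual training script).
config = {
    "learning_rate": 2e-4,
    "per_device_train_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "max_steps": 60,
    "optim": "adamw_8bit",
    "lora_r": 16,
    "lora_alpha": 16,
}

# Each optimizer step sees batch * accumulation = 8 examples,
# so 60 steps cover at most 60 * 8 = 480 training examples.
effective_batch = (config["per_device_train_batch_size"]
                   * config["gradient_accumulation_steps"])
examples_seen = effective_batch * config["max_steps"]
print(effective_batch, examples_seen)  # → 8 480
```

At 60 steps this is a short run, consistent with a quick QLoRA adaptation on a free-tier Colab T4 rather than a full epoch over the 3,000-sample dataset.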
## Usage
This model uses the standard Qwen3 chat template but is optimized to generate reasoning traces inside `<think>` tags.
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Shreyansh327/Qwen3-0.6B-Reasoning-Opus"
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "If I have 10 oranges and give 3 to John and 2 to Mary, how many are left?"
messages = [{"role": "user", "content": prompt}]

# Apply the Qwen3 chat template; the reasoning trace appears
# inside <think> tags in the generated text.
inputs = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
outputs = model.generate(inputs, max_new_tokens=512)

# Decode only the newly generated tokens (skip the prompt).
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```
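Since the model wraps its reasoning in `<think>` tags, you may want to separate the trace from the final answer before displaying it. A minimal regex sketch (the sample string below is an illustrative stand-in for real model output):

```python
import re

def split_reasoning(text: str) -> tuple:
    """Return (reasoning, answer) from a response containing a <think> block.

    If no <think> block is present (e.g. the model answered directly),
    the reasoning part is empty and the full text is the answer.
    """
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        return "", text.strip()
    reasoning = m.group(1).strip()
    answer = text[m.end():].strip()
    return reasoning, answer

# Illustrative sample output, not a captured model response.
sample = "<think>10 - 3 - 2 = 5</think>\nYou have 5 oranges left."
reasoning, answer = split_reasoning(sample)
print(reasoning)  # → 10 - 3 - 2 = 5
print(answer)     # → You have 5 oranges left.
```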