Phonsiri/Qwen2.5-3B-SFT-Reasoning
This model is a Supervised Fine-Tuned (SFT) version of the Qwen2.5-3B base model. It was trained to develop robust step-by-step reasoning, conversational intelligence, and structured output (wrapping its reasoning in <think> tags) prior to any reinforcement-learning or distillation phase.
Despite its compact size of 3 billion parameters, the model punches above its weight class on logic puzzles, conversational alignment, and complex instruction following, and runs comfortably on consumer hardware.
Model Details
- Base Model: Qwen/Qwen2.5-3B
- Training Method: Full Supervised Fine-Tuning (SFT) using SFTTrainer
- Parameters: 3 Billion
- Languages: English, Thai
- Precision: bfloat16
- Context Window during Training: 8192 tokens (to accommodate long reasoning traces)
Dataset Used
The model was fine-tuned exclusively on a dense, high-quality reasoning dataset to imprint deep logic structures directly into its weights:
nohurry/Opus-4.6-Reasoning-3000x-filtered (a heavily filtered dataset of intricate reasoning chains, natively formatted in ChatML)
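For reference, ChatML wraps each conversation turn in `<|im_start|>` / `<|im_end|>` markers. A minimal formatter sketch (the sample content is illustrative, not taken from the dataset):

```python
# Minimal sketch of the ChatML turn format used by Qwen-family models.
# The roles and markers follow the ChatML convention; the messages below
# are illustrative examples, not samples from the training dataset.
def to_chatml(messages, add_generation_prompt=True):
    text = ""
    for m in messages:
        text += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    if add_generation_prompt:
        # Cue the model to produce the assistant turn next.
        text += "<|im_start|>assistant\n"
    return text

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful reasoning assistant."},
    {"role": "user", "content": "What is 7 * 8?"},
])
```

In practice you rarely build this string by hand; `tokenizer.apply_chat_template` (shown in the usage example below) produces the same format from the model's bundled chat template.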
SFT Configuration (Hyperparameters)
The fine-tuning process was configured to ensure high memory efficiency without compromising gradient quality:
- Optimizer: AdamW
- Learning Rate: 2.0e-5 (Cosine Scheduler)
- Epochs: 3
- Batch Size: 4 per device (Effective Batch Size = 16 via Gradient Accumulation)
- Max Sequence Length: 8192 tokens
- Gradient Checkpointing: Enabled (for VRAM efficiency)
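The exact training script is not published; as a sketch, the hyperparameters above map onto a configuration like the following. Field names mirror trl's SFTConfig / TrainingArguments, and `gradient_accumulation_steps = 4` is inferred from the stated effective batch size (16 / 4, assuming a single device), so treat it as an assumption:

```python
# Hypothetical reconstruction of the SFT configuration from the listed
# hyperparameters. Field names mirror trl's SFTConfig/TrainingArguments.
# gradient_accumulation_steps is inferred (16 effective / 4 per device,
# assuming one device) and is NOT confirmed by the model card.
sft_config = {
    "optim": "adamw_torch",
    "learning_rate": 2.0e-5,
    "lr_scheduler_type": "cosine",
    "num_train_epochs": 3,
    "per_device_train_batch_size": 4,
    "gradient_accumulation_steps": 4,   # 4 per device * 4 steps = 16 effective
    "max_seq_length": 8192,
    "bf16": True,                       # bfloat16 precision
    "gradient_checkpointing": True,     # trade compute for VRAM
}

effective_batch = (sft_config["per_device_train_batch_size"]
                   * sft_config["gradient_accumulation_steps"])
```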
Usage Example
You can run this model directly with the Hugging Face transformers library. It responds best to the standard ChatML prompt format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "Phonsiri/Qwen2.5-3B-SFT-Reasoning"

# Load the SFT model
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)

# ChatML format
messages = [
    {"role": "system", "content": "You are Qwen, a brilliant and helpful reasoning assistant."},
    {"role": "user", "content": "A father is 45 years old and his son is 15. In how many years will the father be exactly twice as old as his son?"},
]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Stream tokens in real time to watch the reasoning process unfold
streamer = TextStreamer(tokenizer, skip_prompt=True, skip_special_tokens=False)

# Terminate cleanly on <|im_end|>
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|im_end|>"),
]

_ = model.generate(
    **model_inputs,
    streamer=streamer,
    max_new_tokens=4096,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    eos_token_id=terminators,
)
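Since the model wraps its chain of thought in <think> tags, you may want to separate the reasoning from the final answer after generation. A small helper sketch (the sample completion below is illustrative, not real model output; 15 years is the puzzle's actual answer, since 45 + 15 = 2 × (15 + 15)):

```python
# Split a completion into (reasoning, answer), assuming the model emits
# its chain of thought inside a single <think>...</think> block, as the
# model card describes. Returns ("", text) if no think block is present.
def split_reasoning(completion: str):
    open_tag, close_tag = "<think>", "</think>"
    start = completion.find(open_tag)
    end = completion.find(close_tag)
    if start == -1 or end == -1 or end < start:
        return "", completion.strip()
    reasoning = completion[start + len(open_tag):end].strip()
    answer = completion[end + len(close_tag):].strip()
    return reasoning, answer

# Illustrative completion (not captured from the model):
sample = "<think>45 + x = 2 * (15 + x)  =>  x = 15</think>\nIn 15 years."
reasoning, answer = split_reasoning(sample)
```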