🌟 Qwen3.5-0.8B-Claude-4.6-Opus-Reasoning-Distilled


📒 Release Notes & Build Environment:

  • Training Method: Supervised Fine-Tuning (SFT)
  • Base Model: Qwen3.5-0.8B
  • Training Libraries: Hugging Face Transformers + TRL



💡 Model Introduction

Qwen3.5-0.8B-Claude-4.6-Opus-Reasoning-Distilled is a reasoning-focused model fine-tuned from Qwen3.5-0.8B using structured reasoning traces derived from Claude-4.6 Opus style reasoning datasets.

The model is trained to produce structured chain-of-thought reasoning, enabling it to:

  • Break complex problems into logical steps
  • Produce internal reasoning inside <think> blocks
  • Deliver accurate final answers after reasoning

The training dataset contains curated reasoning examples designed to teach the model step-by-step analytical thinking.
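Because the model emits its reasoning inside <think> tags, downstream code usually needs to separate the trace from the final answer. A minimal sketch of such a parser (the helper name and regex are illustrative, not part of the model's API):

```python
import re

def split_reasoning(response: str) -> tuple[str, str]:
    """Split a model response into (reasoning, final_answer).

    Assumes at most one <think>...</think> block; if no block is
    present, the whole response is treated as the final answer.
    """
    match = re.search(r"<think>(.*?)</think>", response, re.DOTALL)
    if match is None:
        return "", response.strip()
    reasoning = match.group(1).strip()
    final_answer = response[match.end():].strip()
    return reasoning, final_answer
```

This keeps the chain-of-thought available for inspection while letting an application display only the final answer.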


🚀 How to Run the Model

You can run the model using the transformers library.

1️⃣ Install Dependencies

pip install transformers torch accelerate

2️⃣ Run the Model

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "Ishant06/Qwen3.5-0.8B-Claude-4.6-Opus-Reasoning-Distilled"

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Load model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)

prompt = "Explain why the sky is blue."

messages = [
    {"role": "system", "content": "You are a helpful AI assistant."},
    {"role": "user", "content": prompt}
]

# Apply chat template (important for Qwen models)
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=512,
        do_sample=True,  # required for temperature/top_p to take effect
        temperature=0.7,
        top_p=0.9
    )

# Decode only the newly generated tokens, not the echoed prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:],
    skip_special_tokens=True
)

print(response)

Tested with:

  • transformers >= 4.40
  • torch >= 2.0

🧠 Example Reasoning Pattern

The model follows a structured reasoning scaffold such as:

Let me analyze this request carefully:

1. Understand the problem.
2. Break it into smaller steps.
3. Analyze each step logically.
4. Combine the reasoning.
5. Produce the final answer.
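One way to encourage this scaffold at inference time is to spell the steps out in the system message. A hedged sketch (the prompt wording below is illustrative, not the prompt used during training):

```python
# Illustrative system prompt restating the five-step scaffold.
REASONING_SYSTEM_PROMPT = (
    "You are a helpful AI assistant. Before answering, reason step by step: "
    "1) understand the problem, 2) break it into smaller steps, "
    "3) analyze each step logically, 4) combine the reasoning, "
    "5) produce the final answer. Put your reasoning inside <think> tags."
)

def build_messages(question: str) -> list[dict]:
    """Build a chat-format message list that nudges the model
    toward the numbered reasoning scaffold above."""
    return [
        {"role": "system", "content": REASONING_SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]
```

The resulting list can be passed directly to `tokenizer.apply_chat_template` as in the run example above.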

πŸ—ΊοΈ Training Pipeline Overview

Base Model (Qwen3.5-0.8B)
 │
 ▼
Supervised Fine-Tuning (SFT)
 │
 ▼
Final Model (Claude-4.6-Opus-Reasoning-Distilled)

📋 Training Details

🔹 Supervised Fine-Tuning (SFT)

Framework: Hugging Face Transformers + TRL

Training Strategy: Instruction → Response SFT

Goal: Teach the model structured reasoning and step-by-step problem solving.

Format Used During Training:

<think>
internal reasoning
</think>
final answer
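A minimal sketch of how a training example might be rendered into this target format (the field names and prompt/completion split are assumptions about the dataset schema, e.g. for TRL's SFTTrainer; the exact columns used for this model are not documented):

```python
def format_sft_example(instruction: str, reasoning: str, answer: str) -> dict:
    """Render one training example into the <think>-block target format:
    the reasoning trace wrapped in <think> tags, followed by the answer."""
    completion = f"<think>\n{reasoning}\n</think>\n{answer}"
    return {"prompt": instruction, "completion": completion}
```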

📚 Dataset Used

Dataset: crownelius/Opus-4.6-Reasoning-3300x
Description: Claude-4.6 Opus style reasoning dataset containing structured chain-of-thought examples

🌟 Capabilities

The model performs well on tasks that require reasoning, such as:

  • Logical problem solving
  • Mathematical reasoning
  • Coding explanations
  • Step-by-step analysis
  • Instruction following

⚠️ Limitations

  • The model may still hallucinate factual information.
  • Performance is limited by the relatively small 0.8B parameter size.
  • Best suited for experimentation, lightweight reasoning tasks, and research.

πŸ™ Acknowledgements

  • Qwen Team for the base model.
  • The open-source community for providing reasoning datasets.

📖 Citation

@misc{ishant_qwen35_opus_reasoning,
  title        = {Qwen3.5-0.8B-Claude-4.6-Opus-Reasoning-Distilled},
  author       = {Ishant Dere},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Ishant06/Qwen3.5-0.8B-Claude-4.6-Opus-Reasoning-Distilled}}
}