Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF

This terribly named model is a quick finetune of Qwen3.5-4B on the nohurry/Opus-4.6-Reasoning-3000x-filtered dataset. It tends to produce cleaner reasoning traces than the original Qwen3.5-4B and seems about as accurate, though I haven't tested it rigorously. The model was finetuned and converted to GGUF format using Unsloth.

Its reasoning is still a bit inconsistent. On the plus side, it's far less likely to enter endless loops and uses far fewer tokens than the original model. But it's still a 4B model finetuned on a single dataset, so it's not fantastic.

Example usage:

  • For text-only LLMs: llama-cli -hf avalon2244/Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF --jinja
  • For multimodal models: llama-mtmd-cli -hf avalon2244/Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF --jinja
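
If you'd rather call the model from Python, here's a minimal sketch using llama-cpp-python, which can pull GGUF files straight from the Hub. The library choice and the settings below are my assumptions; the card itself only recommends the llama.cpp CLI above.

    # Sketch: load a quant from this repo via llama-cpp-python
    # (assumes: pip install llama-cpp-python huggingface-hub).
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id="avalon2244/Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF",
        filename="Qwen3.5-4B.Q4_K_M.gguf",  # any quant listed below works
        n_ctx=8192,  # context window; lower this if you're short on RAM
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Summarize GGUF in two sentences."}],
        max_tokens=256,
    )
    print(out["choices"][0]["message"]["content"])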

Available model files:

  • Qwen3.5-4B.Q5_K_M.gguf
  • Qwen3.5-4B.Q8_0.gguf
  • Qwen3.5-4B.Q4_K_M.gguf
  • Qwen3.5-4B.BF16-mmproj.gguf
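
If you just want to grab one file without running inference, here's a quick sketch with huggingface_hub; the filename has to match one of the files above exactly.

    # Sketch: download a single .gguf from this repo into the local HF cache
    # (assumes: pip install huggingface-hub).
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id="avalon2244/Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF",
        filename="Qwen3.5-4B.Q4_K_M.gguf",  # or Q5_K_M / Q8_0 / BF16-mmproj
    )
    print(path)  # absolute path to the cached .gguf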

Model details:

  • Format: GGUF
  • Model size: 4B params
  • Architecture: qwen35
  • Quantizations: 4-bit (Q4_K_M), 5-bit (Q5_K_M), 8-bit (Q8_0)

Model tree for avalon2244/Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF:

  • Finetuned from: Qwen/Qwen3.5-4B
  • This repo: quantized GGUF builds of that finetune

Dataset used to train this model: nohurry/Opus-4.6-Reasoning-3000x-filtered