Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF

This terribly named model is a quick finetune of Qwen3.5-4B on the nohurry/Opus-4.6-Reasoning-3000x-filtered dataset. It tends to produce cleaner reasoning traces than the original Qwen3.5-4B and seems about as accurate, though I haven't tested it rigorously. The model was finetuned and converted to GGUF format using Unsloth.

Its reasoning is still a bit inconsistent. On the plus side, it's far less likely to enter endless loops and uses far fewer tokens than the original model. But it's still a 4B model finetuned on a single dataset, so it's not fantastic.

Example usage:

  • For text-only LLMs: llama-cli -hf avalon2244/Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF --jinja
  • For multimodal models: llama-mtmd-cli -hf avalon2244/Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF --jinja
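
If you'd rather call the model from Python, here's a minimal sketch using llama-cpp-python, which can pull GGUF files straight from the Hub. The library choice and the settings below are my assumptions; the card itself only recommends the llama.cpp CLI above.

    # Sketch: load a quant from this repo via llama-cpp-python
    # (assumes: pip install llama-cpp-python huggingface-hub).
    from llama_cpp import Llama

    llm = Llama.from_pretrained(
        repo_id="avalon2244/Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF",
        filename="Qwen3.5-4B.Q4_K_M.gguf",  # any quant listed below works
        n_ctx=8192,  # context window; lower this if you're short on RAM
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Summarize GGUF in two sentences."}],
        max_tokens=256,
    )
    print(out["choices"][0]["message"]["content"])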

Available model files:

  • Qwen3.5-4B.Q5_K_M.gguf
  • Qwen3.5-4B.Q8_0.gguf
  • Qwen3.5-4B.Q4_K_M.gguf
  • Qwen3.5-4B.BF16-mmproj.gguf
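
If you just want to grab one file without running inference, here's a quick sketch with huggingface_hub; the filename has to match one of the files above exactly.

    # Sketch: download a single .gguf from this repo into the local HF cache
    # (assumes: pip install huggingface-hub).
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(
        repo_id="avalon2244/Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF",
        filename="Qwen3.5-4B.Q4_K_M.gguf",  # or Q5_K_M / Q8_0 / BF16-mmproj
    )
    print(path)  # absolute path to the cached .gguf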

Model details:

  • Format: GGUF
  • Model size: 4B params
  • Architecture: qwen35
  • Quantizations: 4-bit (Q4_K_M), 5-bit (Q5_K_M), 8-bit (Q8_0)

Model tree for avalon2244/Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF:

  • Finetuned from: Qwen/Qwen3.5-4B
  • This repo: quantized GGUF builds of that finetune

Dataset used to train this model: nohurry/Opus-4.6-Reasoning-3000x-filtered