Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF
This terribly named model is a quick finetune of Qwen3.5-4B on the nohurry/Opus-4.6-Reasoning-3000x-filtered dataset. It appears to produce cleaner reasoning traces than the original Qwen3.5-4B and to be roughly as accurate, though I haven't tested it rigorously. The model was finetuned and converted to GGUF format using Unsloth.
Its reasoning is a bit inconsistent. It's far less likely to enter endless loops and uses far fewer tokens than the original model, but it's still a 4B model finetuned on a single dataset, so it's not fantastic.
Example usage:

- For text-only LLMs:

```
llama-cli -hf avalon2244/Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF --jinja
```

- For multimodal models:

```
llama-mtmd-cli -hf avalon2244/Qwen3.5-4B-Claude-Opus-4.6-Distilled-GGUF --jinja
```
Available model files:

- Qwen3.5-4B.Q5_K_M.gguf
- Qwen3.5-4B.Q8_0.gguf
- Qwen3.5-4B.Q4_K_M.gguf
- Qwen3.5-4B.BF16-mmproj.gguf

This model was trained 2x faster with Unsloth.
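As a rough guide to picking a quantization, on-disk size scales with bits per weight. A minimal sketch, assuming roughly 4B parameters and typical llama.cpp bits-per-weight figures (the bpw numbers below are general approximations, not measured from these exact files):

```python
# Rough GGUF size estimate: parameters * bits-per-weight / 8.
# The bpw values are typical llama.cpp figures and are assumptions,
# not measurements of the files listed above.
PARAMS = 4e9  # approximate parameter count for a 4B model

QUANT_BPW = {
    "Q4_K_M": 4.8,
    "Q5_K_M": 5.7,
    "Q8_0": 8.5,
    "BF16": 16.0,
}

def approx_size_gb(quant: str, params: float = PARAMS) -> float:
    """Approximate on-disk size in GB for a given quantization."""
    return params * QUANT_BPW[quant] / 8 / 1e9

for q, bpw in QUANT_BPW.items():
    print(f"{q}: ~{approx_size_gb(q):.1f} GB at {bpw} bits/weight")
```

Lower-bit quants trade a little accuracy for a smaller memory footprint; Q4_K_M is a common default when VRAM is tight, while Q8_0 stays close to full precision.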