Qwen3.5-35B-A3B-Opus-Reasoning-Distilled-v2-GGUF
Model Description
This is the Q4_K_M GGUF quantized version of ponytang3/Qwen3.5-35B-A3B-Opus-Reasoning-Distilled-v2.
Original Model
- Base Model: unsloth/Qwen3.5-35B-A3B
- Fine-tuned Model: ponytang3/Qwen3.5-35B-A3B-Opus-Reasoning-Distilled-v2
- Quantization: Q4_K_M (4-bit quantization)
Training Details
- Method: bf16 LoRA + response-only (train_on_responses_only)
- LoRA Rank: 16
- Epochs: 2
- Max Sequence Length: 4096
- Framework: Unsloth + TRL
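The response-only objective (train_on_responses_only) means the loss is computed only on assistant-response tokens, with prompt tokens masked out. A minimal sketch of that masking idea in plain Python (illustrative token IDs; this is not the actual Unsloth/TRL implementation):

```python
# Sketch of response-only label masking, assuming the usual convention
# that tokens labeled -100 are ignored by the cross-entropy loss.
IGNORE_INDEX = -100

def mask_prompt_labels(input_ids, response_start):
    """Copy input_ids to labels, masking everything before the response.

    input_ids: token ids for prompt + response
    response_start: index where the assistant response begins
    """
    return [
        tok if i >= response_start else IGNORE_INDEX
        for i, tok in enumerate(input_ids)
    ]

# Example: first 4 tokens are the prompt, the rest is the response.
tokens = [101, 7592, 2088, 102, 2023, 2003, 1996, 3437]
labels = mask_prompt_labels(tokens, response_start=4)
print(labels)  # [-100, -100, -100, -100, 2023, 2003, 1996, 3437]
```

In practice the trainer locates the response boundary from the chat template's assistant marker; here the start index is passed in directly for clarity.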
Datasets
- nohurry/Opus-4.6-Reasoning-3000x-filtered
- Jackrong/Qwen3.5-reasoning-700x
- Roman1111111/claude-opus-4.6-10000x
Usage with llama.cpp
```
./llama-cli -m model-q4_k_m.gguf -p "Your prompt here" -n 512
```
Format
The model uses `<think>...</think>` tags for chain-of-thought reasoning.
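Because the reasoning is delimited by those tags, a small helper (a sketch, not part of any shipped tooling) can separate the chain-of-thought from the final answer in a completion:

```python
import re

# Split a completion into (reasoning, answer) using the <think>...</think>
# tag format described above. Sketch only.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)

def split_reasoning(text):
    match = THINK_RE.search(text)
    if match is None:
        # No think block: treat the whole output as the answer.
        return "", text.strip()
    reasoning = match.group(1).strip()
    answer = text[match.end():].strip()
    return reasoning, answer

out = "<think>2 + 2 is 4</think>The answer is 4."
print(split_reasoning(out))  # ('2 + 2 is 4', 'The answer is 4.')
```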