Harmonic-2B-GGUF

Harmonic-2B

GGUF quantizations of Harmonic-2B - a reasoning-focused draft model fine-tuned from Qwen 3.5 2B, designed for speculative decoding with the larger Harmonic models.

Support This Work

I'm a PhD student in visual neuroscience at the University of Toronto who also happens to spend way too much time fine-tuning, merging, and quantizing open-weight models on rented H100s and a local DGX Spark. All training compute is self-funded, so I'm always balancing GPU costs against a student budget. If my uploads have been useful to you, consider buying a PhD student a coffee. It goes a long way toward keeping these experiments running.

Support on Ko-fi


Available Quantizations

Quantization   Size     Use Case
F16            3.6 GB   Unquantized 16-bit weights, no quantization loss
Q8_0           1.9 GB   Near-lossless, recommended for most users
Q4_K_M         1.2 GB   Smallest, great for edge/mobile

Recommended Quantization

Q8_0 for most users - near-lossless quality, runs on basically anything.

Q4_K_M for maximum speed as a speculative decoding draft model.

Usage

Ollama

echo 'FROM ./Harmonic-2B-Q8_0.gguf' > Modelfile
ollama create harmonic-2b -f Modelfile
ollama run harmonic-2b
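
Ollama's default context window may be smaller than the 8192 tokens listed under Model Details. A variant of the Modelfile that sets it explicitly might look like the sketch below. PARAMETER num_ctx is a standard Modelfile instruction; the value is taken from this card, and whether you need it depends on your Ollama version and prompts.

```shell
# Write a Modelfile that pins the context window to the card's 8192 tokens.
cat > Modelfile <<'EOF'
FROM ./Harmonic-2B-Q8_0.gguf
PARAMETER num_ctx 8192
EOF

# Then build and run as before:
#   ollama create harmonic-2b -f Modelfile
#   ollama run harmonic-2b
```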

llama.cpp

./llama-cli -m Harmonic-2B-Q8_0.gguf -p "Solve this step by step:" -n 512

LM Studio

Download the GGUF file and load it directly in LM Studio.

Speculative Decoding

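This model is built as a draft model for Harmonic-27B. Both are trained on the same data and reasoning patterns, so the tokens the draft proposes are more likely to match what the larger model would generate, which is what keeps acceptance rates high.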
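
As a rough sketch of how the pairing might be launched with llama.cpp: recent builds of llama-server accept a draft model through -md (--model-draft). The target filename below is hypothetical (this card does not name the Harmonic-27B GGUF files), the draft filename assumes the same naming pattern as the Q8_0 file above, and the 8192 context is assumed to apply to the target as well. The helper name launch_speculative is illustrative, and it only prints the command until both files are present.

```shell
# Sketch: serve a larger target model with Harmonic-2B as the draft model.
# TARGET is a hypothetical filename; point it at your own Harmonic-27B GGUF.
TARGET="./Harmonic-27B-Q4_K_M.gguf"
DRAFT="./Harmonic-2B-Q4_K_M.gguf"

launch_speculative() {
  if [ -f "$TARGET" ] && [ -f "$DRAFT" ]; then
    # -m: target model, -md: draft model, -c: context size in tokens
    ./llama-server -m "$TARGET" -md "$DRAFT" -c 8192
  else
    # Dry run: print the command until both files have been downloaded.
    echo "would run: ./llama-server -m $TARGET -md $DRAFT -c 8192"
  fi
}

launch_speculative
```

Draft-length tuning flags differ between llama.cpp builds, so check ./llama-server --help before adding any.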

Model Details

  • Base: Qwen 3.5 2B (2.27B parameters)
  • Training: 799 curated reasoning rows, LoRA fine-tuned
  • Context: 8192 tokens
  • Reasoning format: Uses think blocks for structured reasoning

See the full model card for training details and data quality metrics.

License

Apache 2.0
