qwen2.5-coder-0.5b-trident-deep-v4.2-gguf

GGUF-quantized versions of Qwen2.5-Coder-0.5B-Instruct fine-tuned (LoRA, PEFT + TRL) on the yuiseki/text2geoql dataset.

This model implements the TRIDENT deep layer: translating AreaWithConcern instructions from TRIDENT intermediate language into executable Overpass QL queries for OpenStreetMap.
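For illustration, a hedged sketch of the mapping this layer performs. The AreaWithConcern input below is a hypothetical rendering of the TRIDENT intermediate language (the exact syntax is not shown on this card), and the Overpass QL output is a typical query of the kind the dataset targets:

```
Input (TRIDENT intermediate language, illustrative):
  AreaWithConcern(area="Tokyo", concern="amenity=hospital")

Output (Overpass QL):
  [out:json][timeout:30];
  area["name"="Tokyo"]->.searchArea;
  nwr["amenity"="hospital"](area.searchArea);
  out center;
```

Here `nwr` is the Overpass QL shorthand for matching nodes, ways, and relations in one statement.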

Performance

  • 100.0% (112/112) on a held-out eval set of pairs excluded from training and guaranteed to return non-empty Overpass API results
  • 25.8 tok/s on Raspberry Pi 5 (Q4_K_M, llama.cpp, CPU-only), making a fully offline TRIDENT deep layer practical

Files

Quantization Size Description
Q4_K_M 380 MB Recommended — fastest on CPU
Q8_0 507 MB Higher precision
F16 949 MB Full precision

Usage (llama.cpp)
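
A minimal invocation sketch using llama.cpp's llama-cli. The GGUF filename and the prompt are assumptions; substitute the file you downloaded and the prompt format the model expects:

```shell
# Filename and prompt are illustrative; adjust to your downloaded GGUF file.
./llama-cli \
  -m qwen2.5-coder-0.5b-trident-deep-v4.2-Q4_K_M.gguf \
  --temp 0 -n 256 \
  -p "Find hospitals in Tokyo"
```

`--temp 0` makes generation deterministic, which is usually what you want when emitting executable queries.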


Inference speed (Raspberry Pi 5, CPU-only)

Quantization Generation speed Time per ~100-token query
Q4_K_M 25.8 tok/s ~4 sec
Q8_0 19.3 tok/s ~5 sec
F16 11.6 tok/s ~9 sec
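
The per-query times above follow directly from throughput (time ≈ tokens / tok/s); a quick sanity check:

```python
# Derive the ~100-token query times in the table from the measured throughput.
speeds = {"Q4_K_M": 25.8, "Q8_0": 19.3, "F16": 11.6}
for quant, tok_per_s in speeds.items():
    print(f"{quant}: {100 / tok_per_s:.1f} s per ~100-token query")
```

This yields 3.9 s, 5.2 s, and 8.6 s respectively; the table's figures round these up slightly, consistent with prompt-processing overhead before generation starts.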

Training

  • Base model: Qwen/Qwen2.5-Coder-0.5B-Instruct
  • Method: LoRA (PEFT + TRL, no Unsloth), r=16, alpha=32
  • Dataset: yuiseki/text2geoql (~4,900 pairs)
  • Training time: ~12 min on NVIDIA RTX 3060 × 2
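
A sketch of the fine-tuning setup described above, using PEFT + TRL as stated. Only r=16, alpha=32, the base model, and the dataset come from this card; the output directory and any defaults left unset are assumptions:

```python
# Hedged sketch of the LoRA fine-tune (PEFT + TRL, no Unsloth).
# Only r/alpha, base model, and dataset are from the card; the rest are assumptions.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

lora = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")
dataset = load_dataset("yuiseki/text2geoql", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-Coder-0.5B-Instruct",
    train_dataset=dataset,
    peft_config=lora,
    args=SFTConfig(output_dir="trident-deep"),  # assumed output path
)
trainer.train()
```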
