qwen2.5-coder-0.5b-trident-deep-v4.2-gguf

GGUF-quantized versions of Qwen2.5-Coder-0.5B-Instruct fine-tuned (LoRA, PEFT + TRL) on the yuiseki/text2geoql dataset.

This model implements the TRIDENT deep layer: translating AreaWithConcern instructions from TRIDENT intermediate language into executable Overpass QL queries for OpenStreetMap.
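For illustration, a hedged sketch of the mapping this layer performs. The AreaWithConcern input below is a hypothetical rendering of the TRIDENT intermediate language (the exact syntax is not shown on this card), and the Overpass QL output is a typical query of the kind the dataset targets:

```
Input (TRIDENT intermediate language, illustrative):
  AreaWithConcern(area="Tokyo", concern="amenity=hospital")

Output (Overpass QL):
  [out:json][timeout:30];
  area["name"="Tokyo"]->.searchArea;
  nwr["amenity"="hospital"](area.searchArea);
  out center;
```

Here `nwr` is the Overpass QL shorthand for matching nodes, ways, and relations in one statement.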

Performance

  • 100.0% (112/112) on a held-out eval set of pairs excluded from training and guaranteed to return non-empty Overpass API results
  • 25.8 tok/s on Raspberry Pi 5 (Q4_K_M, llama.cpp, CPU-only), making a fully offline TRIDENT deep layer practical

Files

Quantization Size Description
Q4_K_M 380 MB Recommended — fastest on CPU
Q8_0 507 MB Higher precision
F16 949 MB Full precision

Usage (llama.cpp)
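
A minimal invocation sketch using llama.cpp's llama-cli. The GGUF filename and the prompt are assumptions; substitute the file you downloaded and the prompt format the model expects:

```shell
# Filename and prompt are illustrative; adjust to your downloaded GGUF file.
./llama-cli \
  -m qwen2.5-coder-0.5b-trident-deep-v4.2-Q4_K_M.gguf \
  --temp 0 -n 256 \
  -p "Find hospitals in Tokyo"
```

`--temp 0` makes generation deterministic, which is usually what you want when emitting executable queries.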


Inference speed (Raspberry Pi 5, CPU-only)

Quantization Generation speed Time per ~100-token query
Q4_K_M 25.8 tok/s ~4 sec
Q8_0 19.3 tok/s ~5 sec
F16 11.6 tok/s ~9 sec
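
The per-query times above follow directly from throughput (time ≈ tokens / tok/s); a quick sanity check:

```python
# Derive the ~100-token query times in the table from the measured throughput.
speeds = {"Q4_K_M": 25.8, "Q8_0": 19.3, "F16": 11.6}
for quant, tok_per_s in speeds.items():
    print(f"{quant}: {100 / tok_per_s:.1f} s per ~100-token query")
```

This yields 3.9 s, 5.2 s, and 8.6 s respectively; the table's figures round these up slightly, consistent with prompt-processing overhead before generation starts.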

Training

  • Base model: Qwen/Qwen2.5-Coder-0.5B-Instruct
  • Method: LoRA (PEFT + TRL, no Unsloth), r=16, alpha=32
  • Dataset: yuiseki/text2geoql (~4,900 pairs)
  • Training time: ~12 min on NVIDIA RTX 3060 × 2
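
A sketch of the fine-tuning setup described above, using PEFT + TRL as stated. Only r=16, alpha=32, the base model, and the dataset come from this card; the output directory and any defaults left unset are assumptions:

```python
# Hedged sketch of the LoRA fine-tune (PEFT + TRL, no Unsloth).
# Only r/alpha, base model, and dataset are from the card; the rest are assumptions.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

lora = LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM")
dataset = load_dataset("yuiseki/text2geoql", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-Coder-0.5B-Instruct",
    train_dataset=dataset,
    peft_config=lora,
    args=SFTConfig(output_dir="trident-deep"),  # assumed output path
)
trainer.train()
```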
