h-d-h commited on
Commit
72fb11f
·
verified ·
1 Parent(s): 13a2c0a

Upload README.md with huggingface_hub

Browse files
Files changed (1) hide show
  1. README.md +51 -0
README.md ADDED
@@ -0,0 +1,51 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ ---
2
+ base_model: unsloth/Qwen2.5-7B-Instruct
3
+ library_name: gguf
4
+ license: apache-2.0
5
+ tags: [gguf, tool-use, radicle, git, qwen2, qlora]
6
+ datasets: [h-d-h/rad-model-dataset]
7
+ ---
8
+
9
+ # rad-model
10
+
11
+ QLoRA fine-tune of [Qwen 2.5 7B Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) for Radicle and Git tool calling. Distributed as a GGUF file for local inference with [llama.cpp](https://github.com/ggerganov/llama.cpp).
12
+
13
+ ## Intended use
14
+
15
+ Tool-calling backend for CLI assistants that work with [Radicle](https://radicle.xyz) (decentralized code collaboration) and Git. The model selects and parameterizes the right CLI tool given a natural language request.
16
+
17
+ ## Training
18
+
19
+ - **Method**: QLoRA (r=32, alpha=64) via [Unsloth](https://github.com/unslothai/unsloth)
20
+ - **Dataset**: [h-d-h/rad-model-dataset](https://huggingface.co/datasets/h-d-h/rad-model-dataset) — ~870 synthetic tool-calling examples covering 89 tools
21
+ - **Hardware**: NVIDIA RTX 3090 (24 GB VRAM)
22
+ - **Quantization**: Q4_K_M (~4.5 GB)
23
+
24
+ ## Serving
25
+
26
+ ```bash
27
+ llama-server -m rad-model-run6-q4_k_m.gguf --port 8080 -ngl 99 --host 0.0.0.0
28
+ ```
29
+
30
+ The model serves an OpenAI-compatible `/v1/chat/completions` endpoint with tool-calling support.
31
+
32
+ ## Evaluation
33
+
34
+ Evaluated on 88 held-out examples (stratified across all 89 tools). Scoring: 1.0 = correct tool + arguments, 0.75 = correct tool + extra args, 0.5 = correct tool + wrong args, 0.0 = wrong tool or no tool call.
35
+
36
+ See [RESULTS.md](https://app.radicle.xyz/nodes/rosa.radicle.xyz/rad:z2YCwgkXrZkUTu8c4CQayvk9Pkpky/tree/RESULTS.md) for full experiment history.
37
+
38
+ ## Limitations
39
+
40
+ - Trained on synthetic data only; may not handle ambiguous real-world requests well
41
+ - Tool descriptions heavily influence accuracy — the base model with good descriptions can outperform the fine-tune on some tasks
42
+ - English only
43
+ - Designed for single-turn or short multi-turn tool-calling; not a general chat model
44
+
45
+ ## Source
46
+
47
+ Developed on [Radicle](https://radicle.xyz): `rad:z2YCwgkXrZkUTu8c4CQayvk9Pkpky`
48
+
49
+ ## License
50
+
51
+ Apache-2.0