hmahadik committed on
Commit 6d659ec · verified · 1 Parent(s): 85c8c4c

docs: README updated for v6 (11 tools, Q5_K_M, single-tool 95.5%)

  1. README.md +21 -18
README.md CHANGED
@@ -20,33 +20,36 @@ inference: false
  # FunctionGemma 270M - Physical AI
 
  Fine-tuned [`google/functiongemma-270m-it`](https://huggingface.co/google/functiongemma-270m-it)
- for voice-controlled physical-AI / household-IoT actions. 13 callable tools
- (lights, neopixel patterns, buzzer, alarms, camera, scene description, system
- status, plus a `respond` natural-language fallback for ambiguous or
- out-of-scope prompts). Reference deployment: Synaptics SL2619 "Coral" edge
- board, Google IO 2026 demo.
 
- The full 13-tool schema ships with this repo as
- [`tools.json`](./tools.json). Token-to-tool mapping is in
- [`token_map.json`](./token_map.json).
 
  ## Output format - function tokens
 
- This model emits tool calls as **function tokens**: each tool name is
- compiled to a single special-vocabulary token (`<tool_0>` … `<tool_12>`)
- and a single `<end>` terminator. A complete call decodes in roughly 8–15
- output tokens, vs ~30–80 for native FunctionGemma's
  `<start_function_call>call:NAME{...}<end_function_call>` syntax. On a
  2-core Cortex-A55 this is the difference between sub-second and 2–5 s
  voice-UX latency.
 
- | File | Sample output | Output tokens |
- |------|---------------|---------------|
- | `functiongemma-physical-ai-Q4_K_M.gguf` | `<tool_3>(3,"red")<end>` | ~8–15 |
 
- The mapping is maintained in [`token_map.json`](./token_map.json) (e.g.
- `<tool_0>` → `turn_on_lights`, `<tool_3>` → `blink_lights`, `<tool_12>` →
- `respond`).
 
  ## Quick start (Ollama)
 
 
  # FunctionGemma 270M - Physical AI
 
  Fine-tuned [`google/functiongemma-270m-it`](https://huggingface.co/google/functiongemma-270m-it)
+ for voice-controlled physical-AI / household-IoT actions on a Synaptics
+ SL2619 "Coral" edge board (Google IO 2026 demo).
 
+ | Revision | File | Tool count | Notes |
+ |----------|------|-----------:|-------|
+ | **v6 (current)** | [`functiongemma-physical-ai-v6-Q5_K_M.gguf`](./functiongemma-physical-ai-v6-Q5_K_M.gguf) | 11 | Camera + vision dropped. Single-tool routing **95.5%**, multi-tool exact-match 23.9%. |
+ | v4c (legacy) | [`functiongemma-physical-ai-Q4_K_M.gguf`](./functiongemma-physical-ai-Q4_K_M.gguf) | 13 | Earlier checkpoint, includes camera/scene tools. |
+
+ Schema ships as [`tools.json`](./tools.json) (11 tools, current). Token-to-tool
+ mapping is in [`token_map.json`](./token_map.json).
 
  ## Output format - function tokens
 
+ Tool calls emit as **function tokens**: each tool name compiles to a single
+ special-vocabulary token (`<tool_0>` … `<tool_10>` for v6) and a single
+ `<end>` terminator. A complete call decodes in roughly 8–15 output tokens,
+ vs ~30–80 for native FunctionGemma's
  `<start_function_call>call:NAME{...}<end_function_call>` syntax. On a
  2-core Cortex-A55 this is the difference between sub-second and 2–5 s
  voice-UX latency.
 
+ Sample output: `<tool_3>(3,"red")<end>` for `blink_lights(count=3, color="red")`.
+
+ `<tool_0>` → `turn_on_lights`, `<tool_3>` → `blink_lights`,
+ `<tool_9>` → `get_system_status`, `<tool_10>` → `respond` (v6 numbering).
+ Full mapping in [`token_map.json`](./token_map.json).
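
Reviewer sketch (not part of the commit): the function-token format described in this hunk can be decoded on the client with a few lines of Python. The `TOKEN_MAP` dictionary below holds only the four v6 entries quoted in the diff; the layout of `token_map.json` is not shown here, so loading it is left out. The names `parse_calls` and `CALL_RE` are hypothetical, and the assumption that arguments are comma-separated Python-style literals is inferred from the single sample `<tool_3>(3,"red")<end>`.

```python
import ast
import re

# Excerpt of the v6 mapping quoted in the diff; the full table is in token_map.json.
TOKEN_MAP = {
    0: "turn_on_lights",
    3: "blink_lights",
    9: "get_system_status",
    10: "respond",
}

# One call looks like <tool_N>(args)<end>. The non-greedy match assumes that
# no argument string contains the literal text ")<end>".
CALL_RE = re.compile(r"<tool_(\d+)>\((.*?)\)<end>", re.DOTALL)


def parse_calls(text):
    """Return a list of (tool_name, args) pairs found in the model output."""
    calls = []
    for match in CALL_RE.finditer(text):
        index = int(match.group(1))
        name = TOKEN_MAP.get(index, f"<unknown tool_{index}>")
        # Assumption: arguments are comma-separated Python-style literals.
        args = ast.literal_eval("[" + match.group(2) + "]")
        calls.append((name, args))
    return calls


print(parse_calls('<tool_3>(3,"red")<end>'))
# [('blink_lights', [3, 'red'])]
```

Because the regular expression is applied with `finditer`, the same function also walks multi-tool output, returning one pair per `<end>`-terminated call.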
 
+ > ⚠️ Inference servers MUST stop generation on `<end_of_turn>` (or `<eos>`),
+ > NOT on `<end>`. Multi-tool sequences emit `<tool_A>(args)<end><tool_B>(args)<end>`,
+ > so stopping at the first `<end>` truncates legitimate multi-tool output.
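
Reviewer sketch (not part of the commit): the effect of the wrong stop sequence can be reproduced offline. The helper `cut_at_stop` is hypothetical and assumes the common server behavior of dropping the stop string and everything after it; the two-call output string is invented for illustration, including the second call's arguments.

```python
def cut_at_stop(text, stop):
    """Mimic a server-side stop sequence: drop the stop string and all text after it."""
    head, _, _ = text.partition(stop)
    return head


# Hypothetical two-call generation; the second call's arguments are made up.
generated = '<tool_3>(3,"red")<end><tool_3>(2,"blue")<end><end_of_turn>'

wrong = cut_at_stop(generated, "<end>")
right = cut_at_stop(generated, "<end_of_turn>")

print(wrong)  # <tool_3>(3,"red")
print(right)  # <tool_3>(3,"red")<end><tool_3>(2,"blue")<end>
print(right.count("<end>"))  # 2
```

Under that assumption, stopping on `<end>` loses the second call and also strips the terminator from the first, so a parser that expects `<end>` after each call would find nothing.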
 
  ## Quick start (Ollama)