hmunachii committed on
Commit 4362dff · verified · 1 Parent(s): beae308

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +5 -9
README.md CHANGED
@@ -13,9 +13,9 @@ tags:
 
 # Needle
 
-A 26M parameter encoder-decoder transformer for on-device function calling, built on a "Simple Attention Network" architecture (no feedforward layers).
-
-Distilled from Gemini 3.1 Flash Lite. Runs at 6000 tok/s prefill and 1200 tok/s decode on [Cactus](https://github.com/cactus-compute/cactus).
+We distilled Gemini 3.1 into a 26m parameter "[Simple Attention Network](docs/simple_attention_networks.md)" that you can even finetune locally on your Mac/PC.
+In production, Needle runs on [Cactus](https://github.com/cactus-compute/cactus) at 6000 toks/sec prefill and 1200 decode speed.
+Weights are fully open on [Cactus-Compute/needle](https://huggingface.co/Cactus-Compute/needle), as well as the dataset generation.
 
 | | |
 |---|---|
@@ -76,10 +76,6 @@ d=512, 8H/4KV, BPE=8192
 └───────────┘
 ```
 
-No feedforward layers. Each encoder block is gated self-attention; each decoder block is gated self-attention + gated cross-attention. The only nonlinearities are softmax and sigmoid.
-
-See [Simple Attention Networks](https://github.com/cactus-compute/needle/blob/main/docs/simple_attention_networks.md) for the full architectural breakdown.
-
 ## Quickstart
 
 ```bash
@@ -119,8 +115,8 @@ Finetune on your own tools via the web UI or CLI:
 # Web UI (generates data via Gemini, trains, evaluates, bundles result)
 needle ui
 
-# CLI
-python -m src.training.finetune data.jsonl --checkpoint checkpoints/needle.pkl
+# CLI (auto-downloads weights if not local)
+python -m src.training.finetune data.jsonl
 ```
 
 ## Links
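
Note on the removed architecture text: the deleted README lines describe each block as gated self-attention with no feedforward layers, with softmax and sigmoid as the only nonlinearities. The sketch below is one possible reading of that description, not the Needle implementation. It assumes a single head, a gate computed as the sigmoid of a linear projection of the input and applied elementwise to the attention output, and a residual add. The function name `gated_self_attention` and the weight names `wq`, `wk`, `wv`, `wg` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matmul(a, b):
    """Multiply an n x m matrix by an m x p matrix (lists of lists)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def gated_self_attention(x, wq, wk, wv, wg):
    """Single-head gated self-attention with a residual connection.

    ASSUMPTION: the gate placement and the residual add are illustrative
    guesses; the source only states that blocks are "gated self-attention".
    """
    q, k, v, g = (matmul(x, w) for w in (wq, wk, wv, wg))
    scale = 1.0 / math.sqrt(len(q[0]))
    # Scaled dot-product scores, then softmax over keys for each query.
    scores = [[scale * sum(a * b for a, b in zip(qi, kj)) for kj in k] for qi in q]
    weights = [softmax(row) for row in scores]
    attended = matmul(weights, v)
    # Sigmoid gate scales the attention output elementwise; no feedforward layer follows.
    return [
        [xi + sigmoid(gi) * ai for xi, gi, ai in zip(x_row, g_row, a_row)]
        for x_row, g_row, a_row in zip(x, g, attended)
    ]
```

In this reading the gate is the only per-token nonlinearity besides the attention softmax, which matches the removed sentence. The hunk header's `d=512, 8H/4KV` suggests the real model uses multiple heads with shared key/value heads, which this single-head sketch does not attempt to reproduce.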