| title: Needle Playground | |
| emoji: 🌵 | |
| colorFrom: green | |
| colorTo: gray | |
| sdk: static | |
| pinned: false | |
| license: mit | |
| short_description: Cactus Needle (26M function-calling) in the browser | |
| models: | |
| - Cactus-Compute/needle | |
| - onnx-community/needle-onnx | |
| # Needle Playground (Browser) | |
| A 26M-parameter function-calling model that runs entirely in your browser: no server, no GPU, just `onnxruntime-web` + WASM. Tap a preset chip and watch the JSON appear. | |
| ## Credits | |
| - **Model:** [Cactus-Compute/needle](https://huggingface.co/Cactus-Compute/needle) — the original Simple Attention Network, trained by [Cactus Compute](https://github.com/cactus-compute) on 200B tokens of pre-training and 2B tokens of function-call post-training. | |
| - **ONNX export:** [onnx-community/needle-onnx](https://huggingface.co/onnx-community/needle-onnx) — browser-ready ONNX artifacts produced by a JAX→PyTorch port + `torch.onnx.export`, with byte-identical output parity against the upstream `needle.generate()`. | |
| - **Browser app:** This Space — Vite + TypeScript, `onnxruntime-web` (WASM backend), `sentencepiece-js` for tokenization. Mirrors the layout of the official `needle playground` CLI. | |
| ## How it works | |
| 1. Page load: fetch `encoder.onnx`, `decoder_step.onnx`, `needle.model` from [onnx-community/needle-onnx](https://huggingface.co/onnx-community/needle-onnx) (~140 MB; cached by the browser after first visit). | |
| 2. Click a preset, or type your own query + tool definitions. | |
| 3. On generate: encoder runs once over `[query, <tools>, JSON.stringify(tools)]`, then the decoder runs step-by-step with KV-cache, seeded with `<eos>`, until the model emits `<eos>` again or 256 tokens are reached. | |
| 4. Output is decoded with SentencePiece, stripped of the leading `<tool_call>` marker, parsed as JSON, and pretty-printed. | |
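The post-processing in step 4 can be sketched roughly as follows. This is a minimal illustration, not the Space's actual code: `formatToolCall` and `TOOL_CALL_MARKER` are hypothetical names, and the fallback to raw text on invalid JSON is an assumption about how a parse failure might be handled.

```typescript
// Marker string taken from the README; the constant name is hypothetical.
const TOOL_CALL_MARKER = "<tool_call>";

// Hypothetical helper: takes the SentencePiece-decoded text and returns
// pretty-printed JSON, or the stripped text if it is not valid JSON.
function formatToolCall(decoded: string): string {
  let text = decoded.trim();
  // Strip the leading <tool_call> marker if the model emitted one.
  if (text.startsWith(TOOL_CALL_MARKER)) {
    text = text.slice(TOOL_CALL_MARKER.length).trim();
  }
  try {
    // Parse, then re-serialize with two-space indentation.
    return JSON.stringify(JSON.parse(text), null, 2);
  } catch {
    // Assumption: show the raw text rather than throwing on malformed output.
    return text;
  }
}
```

For example, `formatToolCall('<tool_call>{"name":"get_weather"}')` returns the same object re-serialized across multiple indented lines.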
| Decoding is greedy (argmax at every step) with an EOS-only stop condition, which matches Cactus's native `generate()` configuration exactly. | |
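The decode loop described above can be sketched as below. This is a simplified illustration under stated assumptions, not the Space's implementation: `StepFn` is a hypothetical callback standing in for one run of `decoder_step.onnx` (including its KV-cache bookkeeping), and `greedyDecode` and `argmax` are illustrative names. Whether the 256-token cap counts the final `<eos>` is not specified in this README; the sketch counts only emitted tokens.

```typescript
// Hypothetical callback: given the previous token id, return logits over the
// vocabulary. In a real app this would wrap a decoder_step.onnx run and carry
// the KV-cache between calls.
type StepFn = (prevToken: number) => Promise<ArrayLike<number>>;

// Index of the largest logit (the first one wins on ties).
function argmax(logits: ArrayLike<number>): number {
  let best = 0;
  for (let i = 1; i < logits.length; i++) {
    if (logits[i] > logits[best]) best = i;
  }
  return best;
}

// Greedy decoding: seed with <eos>, pick the argmax token at each step,
// stop when <eos> is produced again or maxTokens tokens have been emitted.
async function greedyDecode(
  step: StepFn,
  eosId: number,
  maxTokens = 256,
): Promise<number[]> {
  const output: number[] = [];
  let prev = eosId; // decoder is seeded with <eos>
  while (output.length < maxTokens) {
    const next = argmax(await step(prev));
    if (next === eosId) break; // EOS-only stop
    output.push(next);
    prev = next;
  }
  return output;
}
```

Keeping the model call behind a callback means the loop can be exercised with a scripted fake step, without loading any ONNX session.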
| ## Source | |
| Full source is available in this Space (browse files above); it is the same code as the upstream project repo. Build with `npm install && npm run build`; run the dev server with `npm run dev`. | |
| ## License | |
| MIT (matching the upstream Cactus Needle license). | |