needle-playground / README.md
shreyask's picture
Upload README.md with huggingface_hub
ee342db verified
---
title: Needle Playground
emoji: 🌵
colorFrom: green
colorTo: gray
sdk: static
pinned: false
license: mit
short_description: Cactus Needle (26M function-calling) in the browser
models:
- Cactus-Compute/needle
- onnx-community/needle-onnx
---
# Needle Playground (Browser)
26M-parameter function-calling model running entirely in your browser — no server, no GPU, just `onnxruntime-web` + WASM. Tap a preset chip, watch JSON appear.
## Credits
- **Model:** [Cactus-Compute/needle](https://huggingface.co/Cactus-Compute/needle) — the original Simple Attention Network, trained by [Cactus Compute](https://github.com/cactus-compute) on 200B tokens of pre-training and 2B tokens of function-call post-training.
- **ONNX export:** [onnx-community/needle-onnx](https://huggingface.co/onnx-community/needle-onnx) — browser-ready ONNX artifacts produced by a JAX→PyTorch port + `torch.onnx.export`, with byte-identical output parity against the upstream `needle.generate()`.
- **Browser app:** This Space — Vite + TypeScript, `onnxruntime-web` (WASM backend), `sentencepiece-js` for tokenization. Mirrors the layout of the official `needle playground` CLI.
## How it works
1. Page load: fetch `encoder.onnx`, `decoder_step.onnx`, `needle.model` from [onnx-community/needle-onnx](https://huggingface.co/onnx-community/needle-onnx) (~140 MB; cached by the browser after first visit).
2. Click a preset, or type your own query + tool definitions.
3. On generate: encoder runs once over `[query, <tools>, JSON.stringify(tools)]`, then the decoder runs step-by-step with KV-cache, seeded with `<eos>`, until the model emits `<eos>` again or 256 tokens are reached.
4. Output is decoded with SentencePiece, stripped of the leading `<tool_call>` marker, parsed as JSON, and pretty-printed.
Greedy argmax sampling, EOS-only stop — matches Cactus's native `generate()` configuration exactly.
## Source
Full source is available in this Space (browse files above) and the same code as the upstream project repo. Build with `npm install && npm run build`; dev with `npm run dev`.
## License
MIT (matching the upstream Cactus Needle license).