Spaces:

shreyask
/

needle-playground

Running

App Files Files Community

needle-playground / README.md

shreyask

Upload README.md with huggingface_hub

ee342db verified 9 days ago

preview code

raw

history blame contribute delete

2.15 kB

metadata

title: Needle Playground
emoji: 🌵
colorFrom: green
colorTo: gray
sdk: static
pinned: false
license: mit
short_description: Cactus Needle (26M function-calling) in the browser
models:
  - Cactus-Compute/needle
  - onnx-community/needle-onnx

Needle Playground (Browser)

26M-parameter function-calling model running entirely in your browser — no server, no GPU, just onnxruntime-web + WASM. Tap a preset chip, watch JSON appear.

Credits

Model: Cactus-Compute/needle — the original Simple Attention Network, trained by Cactus Compute on 200B tokens of pre-training and 2B tokens of function-call post-training.
ONNX export: onnx-community/needle-onnx — browser-ready ONNX artifacts produced by a JAX→PyTorch port + torch.onnx.export, with byte-identical output parity against the upstream needle.generate().
Browser app: This Space — Vite + TypeScript, onnxruntime-web (WASM backend), sentencepiece-js for tokenization. Mirrors the layout of the official needle playground CLI.

How it works

Page load: fetch encoder.onnx, decoder_step.onnx, needle.model from onnx-community/needle-onnx (~140 MB; cached by the browser after first visit).
Click a preset, or type your own query + tool definitions.
On generate: encoder runs once over [query, <tools>, JSON.stringify(tools)], then the decoder runs step-by-step with KV-cache, seeded with <eos>, until the model emits <eos> again or 256 tokens are reached.
Output is decoded with SentencePiece, stripped of the leading <tool_call> marker, parsed as JSON, and pretty-printed.

Greedy argmax sampling, EOS-only stop — matches Cactus's native generate() configuration exactly.

Source

Full source is available in this Space (browse files above) and the same code as the upstream project repo. Build with npm install && npm run build; dev with npm run dev.

License

MIT (matching the upstream Cactus Needle license).