shreyask committed on
Commit 6cdd82e · verified · 1 Parent(s): 1bc1978

Space README + frontmatter

Files changed (1):
  1. README.md +38 -5

README.md CHANGED
@@ -1,10 +1,43 @@
  ---
- title: Jina Omni Webgpu
- emoji: 👀
- colorFrom: yellow
- colorTo: red
  sdk: static
  pinned: false
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference

  ---
+ title: Jina v5 Omni WebGPU
+ emoji: "\U0001F3B5"
+ colorFrom: indigo
+ colorTo: blue
  sdk: static
  pinned: false
+ license: cc-by-nc-4.0
+ short_description: Cross-modal search on WebGPU with jina-embeddings-v5-omni
+ models:
+ - jinaai/jina-embeddings-v5-omni-nano
+ - shreyask/jina-embeddings-v5-omni-nano-ONNX
+ tags:
+ - multimodal
+ - cross-modal-retrieval
+ - webgpu
+ - transformers.js
+ - onnx
+ - jina-embeddings
  ---

+ # jina · omni · webgpu
+
+ In-browser cross-modal search powered by [`jinaai/jina-embeddings-v5-omni-nano`](https://huggingface.co/jinaai/jina-embeddings-v5-omni-nano), running entirely on WebGPU via [transformers.js](https://huggingface.co/docs/transformers.js) v4 + ONNX Runtime Web.
+
+ One vector space for **text, images, audio, and (eventually) video** — a text query ranks image/audio corpus items, and vice versa, without re-indexing.
+
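Because every modality lands in the same vector space, retrieval reduces to scoring the query embedding against all corpus embeddings regardless of modality. A minimal sketch of that ranking step (plain JavaScript; assumes unit-normalized embeddings from the same model, so cosine similarity is just a dot product — the `rank` helper is illustrative, not this Space's actual code):

```javascript
// Dot product of two equal-length vectors; with unit-normalized
// embeddings this equals cosine similarity.
function dot(a, b) {
  let s = 0;
  for (let i = 0; i < a.length; i++) s += a[i] * b[i];
  return s;
}

// corpus: [{ id, modality, embedding }]; query: an embedding vector.
// Returns the topK items, highest similarity first, with scores attached.
function rank(corpus, query, topK = 5) {
  return corpus
    .map((item) => ({ ...item, score: dot(item.embedding, query) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Because text, image, and audio items all live in the same list, a text query can surface an image above a text snippet with no per-modality index.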
+ ## What's inside
+
+ - **Model load gate** with a precision selector (`q4f16` default — 2.14 GB, or `fp16` for cleaner numerics at 2.68 GB). All three task-specific ONNX graphs (text / vision / audio) download in parallel with per-bundle progress bars and are cached in your browser for subsequent loads.
+ - **Curated query chips** to demo cross-modal retrieval against the seeded corpus.
+ - **Result cards** that render the actual asset — image thumbnails inline, audio with a ▶/❚❚ inline play toggle. Because v5-omni's text→audio alignment is weaker than its text→image alignment, audio rarely cracks the top-K; when it doesn't, the closest audio match is appended after the top-K with an explainer.
+ - **Corpus editor** — clear the seeded 25-item corpus and add your own text / image / audio. Embedding runs in-browser through the same ONNX session the queries use.
+
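The per-bundle progress bars above come down to folding per-file download events into one overall fraction. A sketch under the assumption that each progress event carries `{ file, loaded, total }` (roughly the shape transformers.js-style progress callbacks emit; treat the field names as illustrative, not this Space's exact code):

```javascript
// Tracks the latest { loaded, total } per file and exposes an
// aggregate download fraction across all files seen so far.
function makeProgressTracker() {
  const files = new Map(); // file name -> { loaded, total } bytes

  return {
    // Record (or overwrite) the latest progress event for one file.
    update(ev) {
      files.set(ev.file, { loaded: ev.loaded, total: ev.total });
    },
    // Overall fraction in [0, 1]: total bytes loaded / total bytes expected.
    fraction() {
      let loaded = 0, total = 0;
      for (const f of files.values()) {
        loaded += f.loaded;
        total += f.total;
      }
      return total === 0 ? 0 : loaded / total;
    },
  };
}
```

One tracker per bundle (text / vision / audio) yields the three independent bars; new files joining mid-download simply grow the denominator.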
+ ## Assets + attribution
+
+ The seeded corpus mixes 12 text snippets, 10 instrument photos, and 3 audio clips — all images and audio are sourced from [Wikimedia Commons](https://commons.wikimedia.org/) with full inline artist + license + source attribution. Model license is **CC BY-NC 4.0** (inherited from the base model — non-commercial use only; contact `sales@jina.ai` for commercial).
+
+ ## Links
+
+ - Base model: [jinaai/jina-embeddings-v5-omni-nano](https://huggingface.co/jinaai/jina-embeddings-v5-omni-nano)
+ - Blog: [Jina embeddings v5 omni](https://jina.ai/news/jina-embeddings-v5-omni-multimodal-embeddings-for-text-image-audio-and-video/)
+ - ONNX weights: [shreyask/jina-embeddings-v5-omni-nano-ONNX](https://huggingface.co/shreyask/jina-embeddings-v5-omni-nano-ONNX)