Space README + frontmatter
README.md
---
title: Jina v5 Omni WebGPU
emoji: "\U0001F3B5"
colorFrom: indigo
colorTo: blue
sdk: static
pinned: false
license: cc-by-nc-4.0
short_description: Cross-modal search on WebGPU with jina-embeddings-v5-omni
models:
- jinaai/jina-embeddings-v5-omni-nano
- shreyask/jina-embeddings-v5-omni-nano-ONNX
tags:
- multimodal
- cross-modal-retrieval
- webgpu
- transformers.js
- onnx
- jina-embeddings
---

# jina · omni · webgpu

In-browser cross-modal search powered by [`jinaai/jina-embeddings-v5-omni-nano`](https://huggingface.co/jinaai/jina-embeddings-v5-omni-nano) running entirely on WebGPU via [transformers.js](https://huggingface.co/docs/transformers.js) v4 + ONNX Runtime Web.
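
transformers.js reports download progress through a `progress_callback` option whose events carry a `status`, a `file` name, and a `progress` percentage. A minimal per-file tracker for driving per-bundle progress bars might look like this (a sketch with illustrative names, not the Space's actual code):

```javascript
// Minimal per-file progress tracker for transformers.js-style
// progress_callback events ({ status, file, progress }).
// Illustrative only; the Space's real loader may differ.
function createProgressTracker(onUpdate) {
  const files = new Map(); // file name -> percent complete (0-100)
  return (event) => {
    if (event.status === "progress") {
      files.set(event.file, event.progress ?? 0);
    } else if (event.status === "done") {
      files.set(event.file, 100);
    }
    // Overall percentage across every file seen so far.
    const percents = [...files.values()];
    const overall = percents.reduce((a, b) => a + b, 0) / percents.length;
    onUpdate(Object.fromEntries(files), overall);
  };
}
```

The returned function would be passed as the `progress_callback` option when constructing the pipeline, so the three ONNX bundles downloading in parallel each get their own bar plus an aggregate.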

One vector space for **text, images, audio, and (eventually) video**: a text query ranks image and audio corpus items, and vice versa, without re-indexing.
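
Because every modality lands in the same vector space, retrieval reduces to one cosine-similarity pass over the corpus regardless of the query's modality. A minimal sketch, assuming embeddings are same-length numeric arrays (illustrative, not the Space's actual code):

```javascript
// Cosine similarity between two same-length embedding vectors.
function cosine(a, b) {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Rank corpus items of any modality against a single query embedding.
function rank(queryEmbedding, corpus, topK = 5) {
  return corpus
    .map((item) => ({ ...item, score: cosine(queryEmbedding, item.embedding) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```

Each corpus item only needs a `kind` (`"text"`, `"image"`, `"audio"`) and an `embedding`, which is why a text query can rank images and audio without any modality-specific index.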

## What's inside

- **Model load gate** with a precision selector (`q4f16` default at 2.14 GB, or `fp16` for cleaner numerics at 2.68 GB). All three task-specific ONNX graphs (text / vision / audio) download in parallel with per-bundle progress bars and are cached in your browser for subsequent loads.
- **Curated query chips** to demo cross-modal retrieval against the seeded corpus.
- **Result cards** render the actual asset: image thumbnails inline, audio with a ▶/❚❚ inline play toggle. Audio rarely cracks the top-K (v5-omni's text→audio alignment is weaker than text→image), so when none makes the cut, the closest audio match is appended after the top-K with an explainer.
- **Corpus editor**: clear the seeded 25-item corpus and add your own text / image / audio. Embedding runs in-browser via the same ONNX session the query uses.
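
The audio fallback described above can be sketched as a post-processing step over the ranked list: if no audio item survived the top-K cut, append the best-scoring audio item with a flag the UI can render as an explainer. Names here are illustrative and assume items carry `kind` and `score`:

```javascript
// Append the closest audio match when none made the top-K.
// `ranked` is the full corpus sorted by descending score; illustrative only.
function withAudioFallback(ranked, topK = 5) {
  const top = ranked.slice(0, topK);
  if (top.some((item) => item.kind === "audio")) return top;
  // Sorted descending, so the first audio item found is the closest one.
  const bestAudio = ranked.find((item) => item.kind === "audio");
  return bestAudio ? [...top, { ...bestAudio, fallback: true }] : top;
}
```

The `fallback` flag is what would let the result card show the "closest audio match" explainer instead of presenting the item as a regular top-K hit.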

## Assets + attribution

The seeded corpus mixes 12 text snippets, 10 instrument photos, and 3 audio clips. All images and audio are sourced from [Wikimedia Commons](https://commons.wikimedia.org/) with full inline artist, license, and source attribution. The model license is **CC BY-NC 4.0**, inherited from the base model: non-commercial use only; contact `sales@jina.ai` for commercial licensing.

## Links

- Base model: [jinaai/jina-embeddings-v5-omni-nano](https://huggingface.co/jinaai/jina-embeddings-v5-omni-nano)
- Blog: [Jina embeddings v5 omni](https://jina.ai/news/jina-embeddings-v5-omni-multimodal-embeddings-for-text-image-audio-and-video/)
- ONNX weights: [shreyask/jina-embeddings-v5-omni-nano-ONNX](https://huggingface.co/shreyask/jina-embeddings-v5-omni-nano-ONNX)