---
license: apache-2.0
library_name: sentence-transformers
pipeline_tag: sentence-similarity
language:
- multilingual
tags:
- agentic-intelligence-lab
- elephant
- embeddings
- sentence-transformers
- sentence-similarity
- retrieval
- rag
- agents
- routing
- memory
- multilingual
- matryoshka
- long-context
- modernbert
base_model: llm-semantic-router/mmbert-32k-yarn
datasets:
- BAAI/bge-m3-data
model-index:
- name: elephant-embeddings-v1-text-small
  results:
  - task:
      type: STS
    dataset:
      name: STS Benchmark
      type: mteb/stsbenchmark-sts
    metrics:
    - name: Spearman
      type: spearman
      value: 80.5
---
# Elephant Embeddings V1 Text Small
`elephant-embeddings-v1-text-small` is the text embedding model in the **Agentic Intelligence Lab Elephant Embeddings V1** family.
This ModelScope release is maintained by `agentic-intelligence-lab` to make Elephant embedding models easier to download and deploy in mainland China. It mirrors the upstream Hugging Face model `llm-semantic-router/eggon-embed` and renames it under a consistent Elephant model namespace.
## Positioning
This model is a multilingual long-context text embedding model for agent-native retrieval and semantic matching. It is designed for systems where embeddings are on the runtime hot path:
- agent memory recall
- knowledge retrieval and RAG
- tool, skill, and route matching
- long-horizon state search
- multilingual semantic indexing
- clustering and deduplication
The model combines **32K context**, a **ModernBERT encoder architecture**, and **2D Matryoshka training** (truncation along both embedding dimension and encoder depth), so one embedding space can serve multiple latency, storage, and quality budgets.
## Model at a glance
| Item | Value |
| --- | --- |
| Family | Elephant Embeddings V1 |
| Maintainer | Agentic Intelligence Lab |
| Model type | Text embedding model |
| Modalities | Text |
| Languages | Multilingual |
| Architecture | ModernBERT encoder with YaRN scaling |
| Parameters | ~307M |
| Hidden size | 768 |
| Layers | 22 |
| Context length | 32,768 tokens |
| Pooling | Mean pooling |
| Similarity | Cosine |
| Matryoshka dimensions | 768, 512, 256, 128, 64 |
| Upstream source | `llm-semantic-router/eggon-embed` |
| License | Apache 2.0 |
## Why it fits agentic workloads
Agentic systems call embedding models repeatedly: before retrieval, during routing, while matching tools, when searching memory, and when compressing or reranking state. This model is optimized for that operating pattern rather than for a single offline benchmark.
Key advantages:
- **One semantic space across the stack**: routing, retrieval, memory lookup, and semantic matching can share one vector space.
- **Budget-adaptive vectors**: truncate full 768-dimensional vectors to 512d, 256d, 128d, or 64d, then re-normalize, for cheaper indexes and faster candidate generation.
- **Long-context representation**: encode larger notes, traces, tool descriptions, and document chunks before aggressive chunking is required.
- **Practical deployment size**: a 307M-class encoder is easier to host than much larger embedding models when inference is frequent.
## Recommended use cases
| Scenario | Recommended dimension | Notes |
| --- | ---: | --- |
| Broad route matching | 64d or 128d | Cheap candidate generation over large route/tool sets |
| Large memory-bank search | 64d or 256d | Lower storage and bandwidth cost |
| Main RAG retrieval | 256d or 512d | Balanced quality and cost |
| High-confidence matching | 768d | Best semantic fidelity |
| Long-document indexing | 768d | Preserve richer context before chunking |
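The storage side of the table can be made concrete with a rough estimate. The sketch below assumes float32 storage (4 bytes per value), ignores index structures and metadata, and uses a corpus size of one million vectors purely for illustration; `raw_index_bytes` is a hypothetical helper, not part of any library.

```python
def raw_index_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Raw vector storage only, ignoring index structures and metadata."""
    return num_vectors * dim * bytes_per_value


# Assumed corpus size of one million vectors, for illustration only.
for dim in (768, 512, 256, 128, 64):
    size_gb = raw_index_bytes(1_000_000, dim) / 1e9
    print(f"{dim:>3}d: {size_gb:.3f} GB")
```

Under these assumptions, 64d vectors need one twelfth of the raw space of 768d vectors.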
## Quick start on ModelScope
```bash
pip install modelscope sentence-transformers torch
```
```python
from modelscope import snapshot_download
from sentence_transformers import SentenceTransformer

# Download the model files from ModelScope and load them from the local path
repo_id = "agentic-intelligence-lab/elephant-embeddings-v1-text-small"
local_dir = snapshot_download(repo_id)
model = SentenceTransformer(local_dir)

texts = [
    "Find tool descriptions related to browser automation.",
    "检索和用户历史偏好相关的记忆。",
    "Retrieve notes about deployment failures in staging.",
]

# L2-normalized vectors, so a dot product equals cosine similarity
embeddings = model.encode(texts, normalize_embeddings=True)
print(embeddings.shape)  # (3, 768)
```
## Matryoshka truncation
```python
import torch.nn.functional as F
from modelscope import snapshot_download
from sentence_transformers import SentenceTransformer

local_dir = snapshot_download("agentic-intelligence-lab/elephant-embeddings-v1-text-small")
model = SentenceTransformer(local_dir)

texts = [
    "Find tool descriptions related to browser automation.",
    "Retrieve notes about deployment failures in staging.",
]
embeddings = model.encode(texts, convert_to_tensor=True, normalize_embeddings=True)

# Balanced retrieval tier: keep the first 256 dimensions, then re-normalize
embeddings_256d = F.normalize(embeddings[:, :256], p=2, dim=1)

# Low-cost routing or large memory-bank tier
embeddings_64d = F.normalize(embeddings[:, :64], p=2, dim=1)
```
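One way to combine the tiers is a two-stage search: shortlist candidates with truncated vectors, then rerank the shortlist at full dimension. The following is a minimal pure-Python sketch of that idea, not the maintainers' pipeline. The helper names (`truncate_and_normalize`, `two_stage_search`) are hypothetical, and random vectors stand in for real `model.encode(...)` output.

```python
import math
import random


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` values and rescale to unit L2 norm."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def two_stage_search(query, corpus, coarse_dim=64, shortlist=10, top_k=3):
    """Shortlist with truncated vectors, then rerank the shortlist at full size."""
    q_small = truncate_and_normalize(query, coarse_dim)
    coarse = sorted(
        range(len(corpus)),
        key=lambda i: dot(q_small, truncate_and_normalize(corpus[i], coarse_dim)),
        reverse=True,
    )[:shortlist]
    q_full = truncate_and_normalize(query, len(query))
    reranked = sorted(
        coarse,
        key=lambda i: dot(q_full, truncate_and_normalize(corpus[i], len(corpus[i]))),
        reverse=True,
    )
    return reranked[:top_k]


# Random stand-in vectors (assumption); real use would pass model embeddings.
rng = random.Random(0)
corpus = [[rng.gauss(0, 1) for _ in range(768)] for _ in range(200)]
query = [x + 0.1 * rng.gauss(0, 1) for x in corpus[42]]  # noisy copy of item 42
hits = two_stage_search(query, corpus)
print(hits)
```

In a real deployment the truncated corpus vectors would be computed once and stored in the index rather than recomputed per query; the sketch recomputes them only to stay short.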
## Evaluation snapshot
| Metric | Score |
| --- | ---: |
| MTEB mean, 24 tasks | 61.4 |
| STS Benchmark | 80.5 |
| Dimension retention | 99% @ 256d, 98% @ 64d |
| Layer speedup | 3.3× @ 6L, 5.8× @ 3L |
| Long-context retrieval R@1, 4K tokens | 68.8% |
| Long-context retrieval R@10, 4K tokens | 81.2% |
Together, the dimension-retention and layer-speedup figures indicate that quality degrades little under truncation, which suits systems that must balance quality, latency, vector size, and deployment simplicity.
## Files
| File | Description |
| --- | --- |
| `model.safetensors` | Model weights |
| `config.json` | ModernBERT configuration |
| `tokenizer.json` / `tokenizer_config.json` | Tokenizer assets |
| `modules.json` / `1_Pooling/config.json` | Sentence Transformers packaging |
| `README.md` | This model card |
## Lineage
This ModelScope package is published by `agentic-intelligence-lab` as part of the Elephant model release line. It mirrors the upstream Hugging Face model `llm-semantic-router/eggon-embed` and keeps the model artifacts unchanged; only the repository name and the model card presentation differ.
## Limitations
- Full 768-dimensional embeddings are recommended for important final-stage retrieval decisions.
- Aggressive dimension or layer reduction trades quality for speed and storage efficiency.
- Very long inputs are supported, but they still increase compute and memory cost.
- The model is optimized for retrieval and semantic similarity, not text generation.
## Citation
```bibtex
@misc{elephant-embeddings-v1-text-small,
title={Elephant Embeddings V1 Text Small},
author={Agentic Intelligence Lab},
year={2026},
url={https://modelscope.cn/models/agentic-intelligence-lab/elephant-embeddings-v1-text-small}
}
```
## License
Apache 2.0