PipeOwl
A transformer-free semantic retrieval engine.
PipeOwl performs deterministic vocabulary scoring over a static embedding field:
score = α⋅base + (1 - α⋅base)⋅Δfield
Validation bits per byte (val bpb): 21.417518826563846
where:
Features:
| item | value |
|---|---|
| vocab size | 26155 |
| embedding dim | 256 |
| storage format | safetensors (FP16) |
| model size | ~13.2 MB |
| languages | Japanese |
| startup time | <1s |
| query latency | ~1 ms (CPU, full vocabulary scan) |
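
The scoring formula and the full vocabulary scan in the table can be sketched in a few lines of NumPy. This is a minimal illustration, not the code in `engine.py`: it assumes that `base` is the cosine similarity between the query vector and each vocabulary embedding, that `Δfield` is one static scalar per token, and that `α` is a single scalar. The function names, the value `alpha=0.5`, and the random data are all hypothetical.

```python
import numpy as np

def score_vocabulary(query_vec, embeddings, delta_field, alpha):
    """Apply score = α·base + (1 - α·base)·Δfield to every vocabulary row."""
    # Weights are stored as FP16; compute in FP32 for stable normalization.
    emb = np.asarray(embeddings, dtype=np.float32)
    q = np.asarray(query_vec, dtype=np.float32)
    # Assumption: `base` is cosine similarity.
    emb_unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    q_unit = q / np.linalg.norm(q)
    base = emb_unit @ q_unit
    field = np.asarray(delta_field, dtype=np.float32)
    return alpha * base + (1.0 - alpha * base) * field

def top_k(scores, tokens, k=5):
    """Return the k highest-scoring (token, score) pairs, best first."""
    k = min(k, len(scores))
    idx = np.argpartition(-scores, k - 1)[:k]   # unordered top-k indices
    idx = idx[np.argsort(-scores[idx])]         # sort those k by score
    return [(tokens[i], float(scores[i])) for i in idx]

# Synthetic stand-in with the card's shape: 26155 tokens, 256 dimensions.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((26155, 256)).astype(np.float16)
tokens = [f"tok{i}" for i in range(len(embeddings))]
query = embeddings[42]

# With Δfield = 0 the score reduces to α·base.
flat = score_vocabulary(query, embeddings, np.zeros(len(tokens)), alpha=0.5)
results = top_k(flat, tokens, k=5)

# With Δfield = 1 the score is exactly 1 for every token.
saturated = score_vocabulary(query, embeddings, np.ones(len(tokens)), alpha=0.5)
```

Under these assumptions the formula interpolates between `α·base` (when `Δfield` is 0) and 1 (when `Δfield` is 1), so a token with a large field value can outrank the query token itself.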
git clone https://huggingface.co/WangKaiLin/PipeOwl-1.8.1-jp-evalbpb
cd PipeOwl-1.8.1-jp-evalbpb
pip install numpy safetensors
python quickstart.py
Example semantic retrieval results:
Please enter words: 東京
Top-K Tokens:
0.894 | は
0.739 | 東京
0.674 | 起
0.630 | リーズ
0.609 | ュニ
Please enter words: 大阪
Top-K Tokens:
0.898 | は
0.673 | 大阪
0.670 | 起
0.655 | 東京
0.623 | リーズ
PipeOwl-1.8.1-jp-evalbpb/
├ README.md
├ config.json
├ DATA_SOURCES.md
├ eval_bpb.py
├ LICENSE
├ quickstart.py
├ engine.py
├ vocabulary.json
└ pipeowl_fp16.safetensors
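
The tree includes `eval_bpb.py`, and the card reports a val bpb figure. How that script derives its number is not shown here, so the sketch below only illustrates the general definition of bits per byte: the total negative log2-probability assigned to a text, divided by the text's length in UTF-8 bytes (lower is better). The function name and the toy probabilities are hypothetical, and the sketch assumes per-token probabilities are available.

```python
import math

def bits_per_byte(token_log_probs, text):
    """General bits-per-byte definition.

    token_log_probs: natural-log probabilities, one per token of `text`.
    text: the string those tokens spell out.
    """
    total_bits = -sum(token_log_probs) / math.log(2)  # convert nats to bits
    n_bytes = len(text.encode("utf-8"))
    return total_bits / n_bytes

# Toy example: "東京" is 6 bytes in UTF-8. Two tokens, each with
# probability 1/8, cost 3 bits each, so 6 bits / 6 bytes.
example_bpb = bits_per_byte([math.log(1 / 8)] * 2, "東京")
```

Normalizing by bytes rather than tokens keeps the metric comparable across tokenizers, which matters for Japanese text where one character typically occupies three bytes.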
License: MIT
Base model: WangKaiLin/PipeOwl-1.5-jp