PipeOwl-1.8.2-jp-Temperature (Geometric Embedding)

A transformer-free semantic retrieval engine.

PipeOwl performs deterministic vocabulary scoring over a static embedding field:

score = α⋅base + (1 - α⋅base)⋅Δfield

where:

  • base = cosine similarity in embedding space
  • Δfield = static scalar field bias

Evaluation metrics:

  • BPB (bits per byte): normalized per byte
  • token NLL: normalized per token

token NLL: 12.943284891453972
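
The scoring rule score = α⋅base + (1 - α⋅base)⋅Δfield can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `field_score`, the `alpha` values, and the random toy data are hypothetical, and the actual implementation in `engine.py` may differ.

```python
import numpy as np

def field_score(query_vec, emb, delta_field, alpha=0.5):
    """Apply score = alpha*base + (1 - alpha*base)*delta_field to every row.

    alpha=0.5 is a placeholder, not a value taken from the model.
    """
    # Normalise rows and query so that a dot product equals cosine similarity.
    emb_n = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    q_n = query_vec / np.linalg.norm(query_vec)
    base = emb_n @ q_n  # shape (V,), one pass over the vocabulary
    return alpha * base + (1.0 - alpha * base) * delta_field

# Toy data standing in for the real table (V=5, D=4); not model values.
rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 4)).astype(np.float32)
delta_field = np.zeros(5, dtype=np.float32)  # zero bias: score reduces to alpha*base
scores = field_score(emb[2], emb, delta_field, alpha=1.0)
print(scores.round(3))
```

Two limiting cases follow directly from the formula: with Δfield = 0 the score reduces to α⋅base, and with Δfield = 1 the score is 1 regardless of base, so the bias interpolates each score toward 1.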

Features:

  • O(n) scoring over the vocabulary (n = vocabulary size).
  • No attention.
  • No transformer weights.
  • CPU-friendly (<16MB model)

Architecture

  • Static embedding table (V × D)
  • Aligned vocabulary index
  • Optional scalar bias field (Δfield)
  • Linear scoring
  • Pluggable decoder stage

The design targets CPU environments and low-latency systems such as input method editors (IMEs).
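
One way the components listed under Architecture could fit together is sketched below. The class name `StaticIndex`, its fields, the `argmax_decoder` helper, and the toy vectors are all hypothetical and chosen for illustration; they are not taken from `engine.py`.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional
import numpy as np

def argmax_decoder(scores, vocab):
    """Simplest possible decoder stage: return the single best token."""
    return vocab[int(np.argmax(scores))]

@dataclass
class StaticIndex:
    emb: np.ndarray                            # (V, D) table, rows unit-normalised
    vocab: List[str]                           # vocab[i] is the token for row i
    delta_field: Optional[np.ndarray] = None   # optional (V,) scalar bias
    alpha: float = 1.0                         # placeholder weight, not a model value
    decoder: Callable = argmax_decoder         # pluggable decoder stage

    def __post_init__(self):
        if len(self.vocab) != self.emb.shape[0]:
            raise ValueError("vocabulary and embedding table must align")
        self.row_of = {tok: i for i, tok in enumerate(self.vocab)}

    def query(self, token):
        base = self.emb @ self.emb[self.row_of[token]]  # linear scoring pass
        delta = 0.0 if self.delta_field is None else self.delta_field
        scores = self.alpha * base + (1.0 - self.alpha * base) * delta
        return self.decoder(scores, self.vocab)

# Toy unit vectors, not real embeddings.
vocab = ["東京", "大阪", "パリ"]
emb = np.array([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
index = StaticIndex(emb=emb, vocab=vocab)
print(index.query("大阪"))
```

Swapping the decoder only requires passing another callable that accepts `(scores, vocab)`, for example one that returns a ranked list instead of a single token.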

Model Specs

| Item | Value |
|---|---|
| Vocab size | 26155 |
| Embedding dim | 256 |
| Storage format | safetensors (FP16) |
| Model size | ~13.2 MB |
| Languages | Japanese |
| Startup time | <1 s |
| Query latency | 34 ms (CPU, full vocabulary scan) |
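
As a back-of-envelope check on the model size, the embedding table alone can be estimated from the vocab size, embedding dimension, and FP16 storage listed under Model Specs. The estimate ignores the safetensors header and any additional tensors (such as the bias field), so it is only expected to land in the same range as the stated ~13.2 MB.

```python
VOCAB_SIZE = 26155   # from Model Specs
EMBED_DIM = 256      # from Model Specs
BYTES_PER_FP16 = 2   # FP16 stores each value in two bytes

table_bytes = VOCAB_SIZE * EMBED_DIM * BYTES_PER_FP16
print(table_bytes)                     # 13391360
print(round(table_bytes / 1e6, 2))     # 13.39 (MB, 10**6 bytes)
print(round(table_bytes / 2**20, 2))   # 12.77 (MiB, 2**20 bytes)
```

Either way the table stays under the 16 MB bound quoted in the feature list.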

Quickstart

git clone https://huggingface.co/WangKaiLin/PipeOwl-1.8.2-jp-Temperature
cd PipeOwl-1.8.2-jp-Temperature

pip install numpy safetensors

python quickstart.py
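
To inspect the weights directly rather than through `quickstart.py`, the safetensors file can be loaded into NumPy arrays. The tensor names stored in the file are not documented here, so this sketch lists them instead of assuming any; the helper names `load_tensors` and `describe` are hypothetical.

```python
from typing import Dict, List, Tuple
import numpy as np

def load_tensors(path: str = "pipeowl_fp16.safetensors") -> Dict[str, np.ndarray]:
    """Load every tensor in the file into a dict of NumPy arrays."""
    # Imported lazily; provided by the `pip install` step above.
    from safetensors.numpy import load_file
    return load_file(path)

def describe(tensors: Dict[str, np.ndarray]) -> List[Tuple[str, tuple, str]]:
    """Return (name, shape, dtype) for each tensor, sorted by name."""
    return [(name, arr.shape, str(arr.dtype)) for name, arr in sorted(tensors.items())]

# Usage, from inside the cloned repository:
#   for name, shape, dtype in describe(load_tensors()):
#       print(name, shape, dtype)
```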

Example semantic retrieval results:

Please enter words: 東京

Top-K Tokens:
1.000 | 東京
0.880 | は
0.790 | 大阪
0.766 | パリ
0.747 | 名古屋

Please enter words: 大阪

Top-K Tokens:
1.000 | 大阪
0.889 | は
0.832 | 東京
0.817 | 関西
0.816 | 尼崎
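
Ranked lists like the ones above can be produced from a score vector with a partial sort. The sketch below is an assumption about how such a Top-K step might look, not the code in `quickstart.py`; the function name `top_k` is hypothetical, and the scores are copied from the 東京 example in shuffled order.

```python
import numpy as np

def top_k(scores, vocab, k=5):
    """Return the k highest-scoring (score, token) pairs, best first."""
    k = min(k, len(vocab))
    # argpartition isolates the k largest scores in O(n); only those are sorted.
    idx = np.argpartition(-scores, k - 1)[:k]
    idx = idx[np.argsort(-scores[idx])]
    return [(float(scores[i]), vocab[i]) for i in idx]

vocab = ["大阪", "名古屋", "東京", "は", "パリ"]
scores = np.array([0.790, 0.747, 1.000, 0.880, 0.766])

for score, token in top_k(scores, vocab, k=3):
    print(f"{score:.3f} | {token}")
# 1.000 | 東京
# 0.880 | は
# 0.790 | 大阪
```

Using a partial sort keeps the whole query linear in the vocabulary size, since only the k selected entries are fully ordered.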

Repository Structure

PipeOwl-1.8.2-jp-Temperature/
 ├ README.md
 ├ config.json
 ├ DATA_SOURCES.md
 ├ eval_bpb.py
 ├ parameter.txt
 ├ LICENSE
 ├ quickstart.py
 ├ engine.py
 ├ vocabulary.json
 └ pipeowl_fp16.safetensors

LICENSE

MIT
