# Harrier-Waldwicht-Wurzler-MLX

MLX embedding export of `microsoft/harrier-oss-v1-270m`.
## Overview
Harrier-Waldwicht-Wurzler-MLX is the Waldwicht MLX conversion of microsoft/harrier-oss-v1-270m, prepared for Apple Silicon embedding workloads.
This export keeps the original sentence-transformers structure intact:
- transformer backbone
- pooling head
- normalize head
The model was converted with the Waldwicht MLX toolchain and quantized for smaller local deployment while preserving the original embedding behavior.
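The Pooling and Normalize heads listed above can be illustrated with a small numpy sketch. This is a toy stand-in, not the actual model code: the token embeddings are random, and mean pooling is assumed as the pooling mode.

```python
import numpy as np

# Hypothetical token embeddings for one sentence: (seq_len, hidden),
# with hidden=640 matching this model; values are random stand-ins.
rng = np.random.default_rng(0)
token_embeddings = rng.normal(size=(12, 640))
attention_mask = np.ones(12)  # no padding in this toy example

# Pooling head: mean over non-padded tokens (assumed pooling mode).
mask = attention_mask[:, None]
pooled = (token_embeddings * mask).sum(axis=0) / mask.sum()

# Normalize head: L2-normalize the pooled vector.
embedding = pooled / np.linalg.norm(pooled)

print(round(float(np.linalg.norm(embedding)), 6))  # → 1.0
```

Because of the Normalize head, downstream similarity search can use a plain dot product as cosine similarity.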
## What Was Done Here
This model directory was produced from the base Hugging Face model with the root Makefile in the Waldwicht repository.
The conversion pipeline does the following:
- Converts `microsoft/harrier-oss-v1-270m` into MLX format.
- Quantizes the MLX weights with uniform affine quantization.
- Writes the converted model files into the target output directory.
- Copies the scaffold files from `embeddings/model_scaffold/` into the output so the model folder is self-contained.
- Verifies the result with `make test-embed`.
Default conversion profile used here:
| Setting | Value |
|---|---|
| Quantization | enabled |
| Quantization mode | affine |
| Bits | 8 |
| Group size | 64 |
| Base dtype before quantization | BF16 |
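The profile above (8-bit affine quantization with group size 64) can be sketched in numpy. This is a simplified round trip on fake weights to show the per-group scale/zero-point idea, not the actual MLX quantization kernel.

```python
import numpy as np

# Toy sketch of uniform affine quantization: per-group scale and
# zero point, 8 bits, group size 64 (matching this export's profile).
BITS, GROUP = 8, 64
rng = np.random.default_rng(1)
w = rng.normal(size=(2, 640)).astype(np.float32)  # fake weight rows

groups = w.reshape(-1, GROUP)
w_min = groups.min(axis=1, keepdims=True)
w_max = groups.max(axis=1, keepdims=True)
scale = (w_max - w_min) / (2**BITS - 1)
q = np.round((groups - w_min) / scale).astype(np.uint8)

# Dequantize and check the reconstruction error per element.
w_hat = (q.astype(np.float32) * scale + w_min).reshape(w.shape)
max_err = float(np.abs(w - w_hat).max())
print(max_err <= float(scale.max()))  # → True: error within one step
```

Smaller group sizes would track local weight ranges more tightly at the cost of more stored scales; 64 is the default used for this export.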
## Quick Start
You currently need the Waldwicht repository's `mlx-embeddings` fork, which carries required changes to model loading and generation.
You can also validate a local export from the Waldwicht repository with:
```shell
make test-embed \
  EMBEDDING_MLX_PATH=/path/to/Harrier-Waldwicht-Wurzler-MLX \
  EMBED_TEXT="Waldwicht verifies embedding conversion."
```
## MTEB Sanity Check
The Waldwicht Makefile can run a packaged-runtime MTEB evaluation directly against this MLX export:
```shell
make embed-mteb \
  EMBED_MTEB_MODEL=/path/to/Harrier-Waldwicht-Wurzler-MLX \
  EMBED_MTEB_TASKS="STS12 STS13 STS14" \
  EMBED_MTEB_OVERWRITE=1
```
This writes benchmark artifacts to `mteb-results/benchmark_results.json` and `mteb-results/benchmark_results.md`.
Current quick-check result for the quantized 8-bit affine g64 export on the English MTEB v2 STS subset:
| Task | Score |
|---|---|
| STS12 | 0.605697 |
| STS13 | 0.623238 |
| STS14 | 0.600989 |
| Mean (Task) | 0.609975 |
These numbers are intended as a fast local sanity check for the quantized export, not as a full leaderboard submission.
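As background on what the STS scores above measure: MTEB STS tasks correlate model cosine similarities with human similarity ratings (typically via Spearman rank correlation). The numpy sketch below illustrates that scoring recipe on made-up data; the embeddings and "gold" ratings are fabricated, and the tie-free rank correlation is a simplification.

```python
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def spearman(x, y):
    # Spearman = Pearson correlation of the ranks (no tie handling).
    rx = np.argsort(np.argsort(x)).astype(float)
    ry = np.argsort(np.argsort(y)).astype(float)
    rx -= rx.mean(); ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

# Made-up sentence pairs: random embeddings and fabricated gold ratings.
rng = np.random.default_rng(2)
embs = rng.normal(size=(5, 2, 640))           # 5 pairs of 640-dim vectors
human = np.array([0.1, 0.9, 0.5, 0.7, 0.3])  # fabricated ratings
model_sims = np.array([cosine(a, b) for a, b in embs])
print(-1.0 <= spearman(model_sims, human) <= 1.0)  # → True
```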
## Model Details
| Item | Value |
|---|---|
| Base model | microsoft/harrier-oss-v1-270m |
| MLX model type | gemma3_text |
| Hidden size | 640 |
| Layers | 18 |
| Max position embeddings | 32768 |
| Sentence-transformers modules | Transformer + Pooling + Normalize |
| Quantization | 8-bit affine, group size 64 |
| On-disk size | about 304 MB |
## Waldwicht Workflow
Inside the Waldwicht repository, the intended workflow is:
```shell
make convert-embedding \
  EMBEDDING_MODEL=microsoft/harrier-oss-v1-270m \
  EMBEDDING_MLX_PATH=/path/to/embeddings/Harrier-Waldwicht-Wurzler-MLX
```
The Makefile target installs or reuses the local MLX stack, converts the model, then copies this scaffold into the output directory.
## Included Scaffold Files
This exported directory also includes:
- `README.md` with conversion details and usage notes
- `Makefile` and `scripts/` for the Hugging Face upload workflow
- converted MLX weights and tokenizer/config files
## Waldwicht Inference Server
The Waldwicht repository also includes an Apple Silicon inference stack that can serve embedding models through an OpenAI-compatible API via `omlx`.
Repository: kyr0/waldwicht
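Assuming the server follows the standard OpenAI embeddings API shape, a request against a locally served export might look like the sketch below. The endpoint path and model name are illustrative, and the response is abbreviated:

```
POST /v1/embeddings
Content-Type: application/json

{"model": "Harrier-Waldwicht-Wurzler-MLX",
 "input": ["Waldwicht verifies embedding conversion."]}

# abbreviated response shape:
{"object": "list",
 "data": [{"object": "embedding", "index": 0, "embedding": [0.01, ...]}],
 "model": "Harrier-Waldwicht-Wurzler-MLX"}
```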
## Base Model
For the original base model card and benchmark claims, see microsoft/harrier-oss-v1-270m.