Harrier-Waldwicht-Wurzler-MLX
MLX embedding export of microsoft/harrier-oss-v1-270m

Overview

Harrier-Waldwicht-Wurzler-MLX is the Waldwicht MLX conversion of microsoft/harrier-oss-v1-270m, prepared for Apple Silicon embedding workloads.

This export keeps the original sentence-transformers structure intact:

transformer backbone
pooling head
normalize head
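
To illustrate what the pooling and normalize stages do to the backbone output, here is a minimal pure-Python sketch. It is not the code shipped in this export: the function names are hypothetical, and the pooling mode this model actually uses is set by its sentence-transformers pooling config, which this card does not state. Both "mean" and "last" are shown only as common options.

```python
import math

def pool(token_vectors, attention_mask, mode="mean"):
    """Collapse per-token vectors into one vector, ignoring padded positions."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    if mode == "last":
        # Use the final non-padding token as the sentence representation.
        return list(kept[-1])
    if mode == "mean":
        # Average each hidden dimension over the non-padding tokens.
        dim = len(kept[0])
        return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]
    raise ValueError(f"unknown pooling mode: {mode}")

def l2_normalize(vector, eps=1e-12):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / max(norm, eps) for x in vector]

# Toy backbone output: 3 tokens with hidden size 2; the last token is padding.
tokens = [[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]

mean_embedding = l2_normalize(pool(tokens, mask, mode="mean"))
last_embedding = l2_normalize(pool(tokens, mask, mode="last"))
```

Because the final stage normalizes to unit length, two embeddings can be compared with a plain dot product.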

The model was converted with the Waldwicht MLX toolchain and quantized to 8-bit for smaller local deployment, with the goal of keeping the embedding behavior close to the original.

What Was Done Here

This model directory was produced from the base Hugging Face model with the root Makefile in the Waldwicht repository.

The conversion pipeline does the following:

Converts microsoft/harrier-oss-v1-270m into MLX format.
Quantizes the MLX weights with uniform affine quantization.
Writes the converted model files into the target output directory.
Copies the scaffold files from embeddings/model_scaffold/ into the output so the model folder is self-contained.
Verifies the result with make test-embed.

Default conversion profile used here:

Setting	Value
Quantization	enabled
Quantization mode	affine
Bits	8
Group size	64
Base dtype before quantization	BF16
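
The profile above can be read as group-wise affine quantization: each run of 64 weights gets its own scale and bias, and every weight is stored as an 8-bit code. The following is a conceptual sketch in plain Python, not the MLX kernel; the function names are hypothetical, and the storage estimate assumes one 16-bit scale and one 16-bit bias per group.

```python
import math

def quantize_group(values, bits=8):
    """Map one group of floats to integer codes in [0, 2**bits - 1]."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0  # fall back to 1.0 for constant groups
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo  # lo acts as the per-group bias

def dequantize_group(codes, scale, bias):
    """Recover approximate floats from codes: w ~ code * scale + bias."""
    return [c * scale + bias for c in codes]

def quantize(weights, group_size=64, bits=8):
    """Quantize a flat weight list group by group."""
    if len(weights) % group_size:
        raise ValueError("weight count must be a multiple of group_size")
    return [
        quantize_group(weights[i:i + group_size], bits)
        for i in range(0, len(weights), group_size)
    ]

def dequantize(groups):
    """Concatenate the dequantized values of all groups."""
    return [w for codes, scale, bias in groups for w in dequantize_group(codes, scale, bias)]

# Toy weights: two groups of 64 values each.
weights = [math.sin(i) for i in range(128)]
groups = quantize(weights, group_size=64, bits=8)
restored = dequantize(groups)
max_error = max(abs(a - b) for a, b in zip(weights, restored))

# Assumed storage cost: 8 bits per code plus 2 x 16 bits of metadata per group.
effective_bits = 8 + (2 * 16) / 64
```

Under these assumptions the rounding error per weight is at most half a quantization step, and the storage cost works out to 8.5 bits per weight.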

Quick Start

This export currently requires the mlx-embeddings fork from the Waldwicht repository, because I had to change the model loading and generation code.

From the Waldwicht repository, you can validate a local export with:

make test-embed \
  EMBEDDING_MLX_PATH=/path/to/Harrier-Waldwicht-Wurzler-MLX \
  EMBED_TEXT="Waldwicht verifies embedding conversion."

MTEB Sanity Check

The Waldwicht Makefile can run a packaged-runtime MTEB evaluation directly against this MLX export:

make embed-mteb \
  EMBED_MTEB_MODEL=/path/to/Harrier-Waldwicht-Wurzler-MLX \
  EMBED_MTEB_TASKS="STS12 STS13 STS14" \
  EMBED_MTEB_OVERWRITE=1

This writes benchmark artifacts to mteb-results/benchmark_results.json and mteb-results/benchmark_results.md.

Current quick-check result for the quantized 8-bit affine g64 export on the English MTEB v2 STS subset:

Task	Score
STS12	0.605697
STS13	0.623238
STS14	0.600989
Mean (Task)	0.609975

These numbers are intended as a fast local sanity check for the quantized export, not as a full leaderboard submission.
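
As a quick arithmetic check, the Mean (Task) row is the unweighted average of the three task scores in the table. The snippet below simply recomputes it from the values listed above; it does not read the generated result files.

```python
# Task scores copied from the table above.
scores = {"STS12": 0.605697, "STS13": 0.623238, "STS14": 0.600989}

# Unweighted mean over tasks.
mean_score = sum(scores.values()) / len(scores)
print(f"{mean_score:.6f}")  # prints 0.609975
```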

Model Details

Item	Value
Base model	`microsoft/harrier-oss-v1-270m`
MLX model type	`gemma3_text`
Hidden size	640
Layers	18
Max position embeddings	32768
Sentence-transformers modules	Transformer + Pooling + Normalize
Quantization	8-bit affine, group size 64
On-disk size	about 304 MB

Waldwicht Workflow

Inside the Waldwicht repository, the intended workflow is:

make convert-embedding \
  EMBEDDING_MODEL=microsoft/harrier-oss-v1-270m \
  EMBEDDING_MLX_PATH=/path/to/embeddings/Harrier-Waldwicht-Wurzler-MLX

The Makefile target installs or reuses the local MLX stack, converts the model, then copies this scaffold into the output directory.

Included Scaffold Files

This exported directory also includes:

README.md with conversion details and usage notes
Makefile and scripts/ for Hugging Face upload workflow
converted MLX weights and tokenizer/config files

Waldwicht Inference Server

The Waldwicht repository also includes an Apple Silicon inference stack that can serve embedding models through an OpenAI-compatible API via omlx.

Repository: kyr0/waldwicht
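
A client for an OpenAI-compatible embeddings endpoint could look like the sketch below. The request and response shapes follow the OpenAI embeddings API. The base URL, port, and model name are assumptions for illustration and are not taken from the Waldwicht repository, so adjust them to match your server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1"         # hypothetical host and port
MODEL_NAME = "Harrier-Waldwicht-Wurzler-MLX"  # hypothetical served model name

def build_embeddings_request(texts, base_url=BASE_URL, model=MODEL_NAME):
    """Build a POST request for the OpenAI-style /embeddings endpoint."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/embeddings",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def parse_embeddings_response(body):
    """Extract embedding vectors from the response, ordered by input index."""
    data = json.loads(body)["data"]
    return [item["embedding"] for item in sorted(data, key=lambda d: d["index"])]

def embed(texts):
    """Send texts to the server and return one vector per input text."""
    with urllib.request.urlopen(build_embeddings_request(texts)) as response:
        return parse_embeddings_response(response.read())
```

With a server running, `embed(["Waldwicht verifies embedding conversion."])` would return a list containing a single vector.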

Base Model

For the original base model card and benchmark claims, see microsoft/harrier-oss-v1-270m.
