About
Standard MLX quants of microsoft/harrier-oss-v1-27b. Converted using oMLX v0.3.5 on an M2 Ultra (192GB).
Harrier-OSS-v1-27B (oQ Quantized for MLX)
harrier-oss-v1 is a family of multilingual text embedding models developed by Microsoft. The models use decoder-only architectures with last-token pooling and L2 normalization to produce dense text embeddings. They can be applied to a wide range of tasks, including but not limited to retrieval, clustering, semantic similarity, classification, bitext mining, and reranking. The models achieve state-of-the-art results on the Multilingual MTEB v2 benchmark as of the release date.
| Model | Parameters | Embedding Dimension | Max Tokens | MTEB v2 Score |
|---|---|---|---|---|
| harrier-oss-v1-270m | 270M | 640 | 32,768 | 66.5 |
| harrier-oss-v1-0.6b | 0.6B | 1,024 | 32,768 | 69.0 |
| harrier-oss-v1-27b | 27B | 5,376 | 32,768 | 74.3 |
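The last-token pooling and L2 normalization mentioned above can be illustrated with a small, dependency-free sketch. This is not the model's actual implementation: the helper names (`last_token_pool`, `l2_normalize`, `dot`) and the toy 2-dimensional hidden states are hypothetical, and right-side padding is assumed.

```python
import math

def last_token_pool(hidden_states, attention_mask):
    """Return the hidden state of the last non-padding token.

    Assumes right padding: real tokens come first, padding (mask 0) follows.
    """
    last_index = sum(attention_mask) - 1
    return hidden_states[last_index]

def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        return list(vector)
    return [x / norm for x in vector]

def dot(a, b):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(x * y for x, y in zip(a, b))

# Toy sequence: three positions, the last one is padding.
hidden_states = [[1.0, 0.0], [3.0, 4.0], [9.0, 9.0]]
attention_mask = [1, 1, 0]

pooled = last_token_pool(hidden_states, attention_mask)  # [3.0, 4.0]
embedding = l2_normalize(pooled)                         # [0.6, 0.8]
print(embedding)
```

Because the embeddings are unit length, comparing two of them with a plain dot product gives their cosine similarity.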
This repository contains oQ quantized variants of the microsoft/harrier-oss-v1-27b multilingual embedding model. These weights are optimized specifically for Apple Silicon (M-series) hardware using the oMLX framework.
Quantization Details
These models were converted using oMLX v0.3.5 on an M2 Ultra (192GB).
| Variant | Target bpw | RAM Usage (Est.) | Recommended Use Case |
|---|---|---|---|
| oQ4 | ~4.5 | 18.2 GB | Maximum throughput / Low memory overhead |
| oQ6 | ~6.7 | 21.1 GB | Durable Balance: Near-lossless RAG retrieval |
| oQ8 | ~8.5 | 26.8 GB | Archive/Audit grade fidelity |
Usage (MLX / oMLX)
Prompting Requirements
CRITICAL: As per the original Microsoft implementation, instructions must be added to the query for optimal performance:
- Query Format: `Instruct: {task_description}\nQuery: {query}`
- Document Format: No instruction required; use raw text.
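A minimal sketch of the prompt formats above, assuming that `\n` in the query template denotes a real newline character. The helper names `format_query` and `format_document` and the example task description are hypothetical and chosen only for illustration; use the task wording that fits your application.

```python
def format_query(task_description: str, query: str) -> str:
    """Prefix a query with its task instruction, per the query format above."""
    return f"Instruct: {task_description}\nQuery: {query}"

def format_document(text: str) -> str:
    """Documents are embedded as raw text, with no instruction prefix."""
    return text

# Illustrative task description (an assumption, not taken from the model card).
task = "Given a web search query, retrieve relevant passages that answer the query"

query_input = format_query(task, "how do plants make food?")
document_input = format_document("Photosynthesis converts light into chemical energy.")

print(query_input)
print(document_input)
```

Only the query side carries the instruction, so the same document embeddings can be reused across different tasks.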
Via oMLX Dashboard
- Open your oMLX Admin Console (`localhost:8000/admin`).
- Search for `splats/harrier-oss-v1-27b-oQ8-MLX`.
- Select the desired quantization folder (e.g., `oQ8`) from the model directory list.
- Once the folder is downloaded, the model is ready to serve as an embedding endpoint.
Via CLI
hf download splats/harrier-oss-v1-27b-oQ8-MLX --local-dir ./harrier-oss-v1-27b-oQ8
omlx serve --model ./harrier-oss-v1-27b-oQ8