---
license: apache-2.0
base_model: openai/privacy-filter
pipeline_tag: token-classification
library_name: openmed
tags:
- openmed
- mlx
- apple-silicon
- token-classification
- pii
- privacy
- de-identification
- redaction
- quantized
- int8
- q8
- medical
- clinical
---
# OpenAI Privacy Filter MLX 8-bit
This repository contains an 8-bit OpenMed MLX artifact for [`openai/privacy-filter`](https://huggingface.co/openai/privacy-filter), packaged for local PII detection on Apple Silicon with [OpenMed](https://github.com/maziyarpanahi/openmed).
OpenAI Privacy Filter is a bidirectional token-classification model for detecting personally identifiable information in text. This OpenMed MLX build keeps the original BIOES token-label head, uses the `o200k_base` tokenizer assets, and runs with OpenMed's Python and Swift MLX runtimes.
After the model is downloaded once, inference runs locally. No document text is sent to a server.
## Model Details
- Source checkpoint: [`openai/privacy-filter`](https://huggingface.co/openai/privacy-filter)
- OpenMed MLX family: `openai-privacy-filter`
- Task: token classification for privacy span detection
- Weight format: `weights.safetensors`
- Quantization: 8-bit affine quantization, group size 64
- Runtime: OpenMed + MLX on Apple Silicon
- Tokenizer: `o200k_base` / tiktoken-style BPE
- Labels: `account_number`, `private_address`, `private_date`, `private_email`, `private_person`, `private_phone`, `private_url`, `secret`
This artifact uses expert-aware MLX quantization: embeddings, attention projections, MoE gates, sparse-MoE expert tensors, and the token-classification head are all stored in 8-bit packed form. The resulting `weights.safetensors` file is about 1.39 GiB, compared with about 2.61 GiB for the BF16 OpenMed MLX artifact.
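To make the weight format concrete, here is a NumPy sketch of per-group 8-bit affine quantization with group size 64. This is an illustration of the general scheme (each group of 64 weights gets its own scale and bias, and each weight is stored as one byte), not OpenMed's actual packing code:

```python
import numpy as np

def quantize_q8(w: np.ndarray, group_size: int = 64):
    """Per-group 8-bit affine quantization: w ~= q * scales + biases."""
    groups = w.reshape(-1, group_size)
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    scales = (w_max - w_min) / 255.0
    scales = np.where(scales == 0, 1.0, scales)  # guard constant groups
    q = np.round((groups - w_min) / scales).astype(np.uint8)
    return q, scales, w_min  # biases are the per-group minima

def dequantize_q8(q, scales, biases):
    return (q.astype(np.float32) * scales + biases).reshape(-1)

w = np.random.randn(256).astype(np.float32)
q, s, b = quantize_q8(w)
w_hat = dequantize_q8(q, s, b)
# Rounding error is at most half a quantization step per group.
assert np.all(np.abs(w - w_hat) <= (s / 2 + 1e-6).repeat(64))
```

At one byte per weight plus a small per-group overhead for scales and biases, this roughly halves the size of a BF16 tensor, which matches the 2.61 GiB → 1.39 GiB reduction above.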
## Quick Start: Python
```bash
pip install -U "openmed[mlx]"
```
```python
from huggingface_hub import snapshot_download
from openmed.mlx.inference import create_mlx_pipeline
model_path = snapshot_download("OpenMed/privacy-filter-mlx-8bit")
pipe = create_mlx_pipeline(model_path)
text = "My name is Alice Smith and my email is alice.smith@example.com."
entities = pipe(text)
for entity in entities:
    print(entity)
```
Example output:
```python
{
    "entity_group": "private_person",
    "word": "Alice Smith",
    "start": 11,
    "end": 22,
    "score": 0.9999,
}
{
    "entity_group": "private_email",
    "word": "alice.smith@example.com",
    "start": 39,
    "end": 62,
    "score": 0.9998,
}
```
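The `start`/`end` fields are character offsets into the input, so redaction follows directly from the pipeline output. A minimal sketch (the `redact` helper below is illustrative, not part of the OpenMed API):

```python
def redact(text: str, entities: list[dict]) -> str:
    """Replace each detected span with a [LABEL] placeholder, working
    right to left so earlier character offsets stay valid."""
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = f"[{ent['entity_group'].upper()}]"
        text = text[: ent["start"]] + placeholder + text[ent["end"] :]
    return text

text = "My name is Alice Smith and my email is alice.smith@example.com."
entities = [
    {"entity_group": "private_person", "start": 11, "end": 22},
    {"entity_group": "private_email", "start": 39, "end": 62},
]
print(redact(text, entities))
# My name is [PRIVATE_PERSON] and my email is [PRIVATE_EMAIL].
```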
## Quick Start: Swift and Apple Apps
Add OpenMedKit to your Xcode project:
1. Open Xcode and choose File > Add Package Dependencies.
2. Paste `https://github.com/maziyarpanahi/openmed`.
3. Select the `OpenMedKit` package product.
4. Download and cache the MLX model once, then run inference locally.
```swift
import OpenMedKit
let modelURL = try await OpenMedModelStore.downloadMLXModel(
    repoID: "OpenMed/privacy-filter-mlx-8bit"
)
let openmed = try OpenMed(backend: .mlx(modelDirectoryURL: modelURL))
let entities = try openmed.extractPII(
    "My name is Alice Smith and my email is alice.smith@example.com."
)
for entity in entities {
    print(entity.text, entity.label, entity.score)
}
```
For iOS, run on Apple Silicon hardware. The iOS Simulator is not the recommended acceptance target for MLX inference.
## Validation
The 8-bit artifact was validated against the unquantized OpenMed MLX artifact with fixed text samples. BF16 and Q8 returned identical grouped spans for person, date, phone, email, address, and account-number examples.
OpenMed also includes unit tests for:
- q8 artifact loading
- quantization metadata decoding
- expert tensor packing and `.scales` coverage
- finite logits from the q8 runtime
- bf16/q8 shape and argmax-label coherence
- BIOES/Viterbi span decoding
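For context on the last item, BIOES assigns each token one of `B-` (begin), `I-` (inside), `E-` (end), `S-` (single-token span), or `O` (outside). A greedy decoder for this scheme can be sketched in a few lines; this illustrates the label semantics only, and is simpler than the Viterbi decoding OpenMed tests:

```python
def bioes_spans(labels: list[str]) -> list[tuple[str, int, int]]:
    """Greedy BIOES decoding to (entity_type, start, end_exclusive)
    token spans; malformed transitions drop the open span."""
    spans, start, ent = [], None, None
    for i, lab in enumerate(labels):
        tag, _, typ = lab.partition("-")
        if tag == "S":
            spans.append((typ, i, i + 1))
            start, ent = None, None
        elif tag == "B":
            start, ent = i, typ
        elif tag == "I" and ent == typ:
            continue
        elif tag == "E" and ent == typ and start is not None:
            spans.append((typ, start, i + 1))
            start, ent = None, None
        else:  # "O" or an inconsistent transition
            start, ent = None, None
    return spans

labels = ["O", "B-private_person", "E-private_person", "O", "S-private_email"]
print(bioes_spans(labels))
# [('private_person', 1, 3), ('private_email', 4, 5)]
```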
## Intended Use
Use this model for local privacy filtering, PII detection, redaction workflows, and evaluation on Apple devices. For high-risk domains such as healthcare, legal, finance, education, and government, evaluate against your own data and policy requirements before production use.
## Credits
- Base checkpoint: [`openai/privacy-filter`](https://huggingface.co/openai/privacy-filter)
- MLX conversion and runtime support: [OpenMed](https://github.com/maziyarpanahi/openmed)
- OpenMed website: [https://openmed.life](https://openmed.life)