---
license: apache-2.0
base_model: openai/privacy-filter
pipeline_tag: token-classification
library_name: openmed
tags:
- openmed
- mlx
- apple-silicon
- token-classification
- pii
- privacy
- de-identification
- redaction
- quantized
- int8
- q8
- medical
- clinical
---

# OpenAI Privacy Filter MLX 8-bit

This repository contains an 8-bit OpenMed MLX artifact for [`openai/privacy-filter`](https://huggingface.co/openai/privacy-filter), packaged for local PII detection on Apple Silicon with [OpenMed](https://github.com/maziyarpanahi/openmed).

OpenAI Privacy Filter is a bidirectional token-classification model for detecting personally identifiable information in text. This OpenMed MLX build keeps the original BIOES token-label head, uses the `o200k_base` tokenizer assets, and runs with OpenMed's Python and Swift MLX runtimes.

After the model is downloaded once, inference runs entirely locally; no document text is sent to a server.

## Model Details

- Source checkpoint: [`openai/privacy-filter`](https://huggingface.co/openai/privacy-filter)
- OpenMed MLX family: `openai-privacy-filter`
- Task: token classification for privacy span detection
- Weight format: `weights.safetensors`
- Quantization: 8-bit affine quantization, group size 64
- Runtime: OpenMed + MLX on Apple Silicon
- Tokenizer: `o200k_base` / tiktoken-style BPE
- Labels: `account_number`, `private_address`, `private_date`, `private_email`, `private_person`, `private_phone`, `private_url`, `secret`

This artifact uses expert-aware MLX quantization: embeddings, attention projections, MoE gates, sparse-MoE expert tensors, and the token-classification head are all stored in 8-bit packed form. The resulting `weights.safetensors` file is about 1.39 GiB, compared with about 2.61 GiB for the BF16 OpenMed MLX artifact.
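
Group-wise affine quantization stores one scale and one offset per group of 64 consecutive weights alongside the packed 8-bit values, so an outlier in one group does not distort the others. A minimal NumPy sketch of the idea (illustrative only; this is not the actual MLX quantization kernel, and the function names are hypothetical):

```python
import numpy as np

def affine_quantize(w: np.ndarray, group_size: int = 64, bits: int = 8):
    """Illustrative group-wise affine quantization (not the MLX kernel).

    Each group of `group_size` consecutive weights gets its own scale and
    minimum, mapping the group's range onto the unsigned 8-bit grid.
    """
    groups = w.reshape(-1, group_size)
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    qmax = 2 ** bits - 1
    scales = (w_max - w_min) / qmax
    scales = np.where(scales == 0, 1.0, scales)  # constant groups: avoid /0
    q = np.clip(np.round((groups - w_min) / scales), 0, qmax).astype(np.uint8)
    return q, scales, w_min

def affine_dequantize(q, scales, w_min):
    """Reconstruct float weights from packed values, scales, and minima."""
    return q.astype(np.float32) * scales + w_min

w = np.random.default_rng(0).normal(size=(4, 64)).astype(np.float32)
q, scales, w_min = affine_quantize(w)
w_hat = affine_dequantize(q, scales, w_min).reshape(w.shape)
# Round-to-nearest bounds the per-weight error by half a quantization step.
print(np.abs(w - w_hat).max())
```

With round-to-nearest, the reconstruction error per weight is at most half of that group's scale, which is why small group sizes keep 8-bit quality close to BF16.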

## Quick Start: Python

```bash
pip install -U "openmed[mlx]"
```

```python
from huggingface_hub import snapshot_download
from openmed.mlx.inference import create_mlx_pipeline

model_path = snapshot_download("OpenMed/privacy-filter-mlx-8bit")
pipe = create_mlx_pipeline(model_path)

text = "My name is Alice Smith and my email is alice.smith@example.com."
entities = pipe(text)

for entity in entities:
    print(entity)
```

Example output:

```python
{
    "entity_group": "private_person",
    "word": "Alice Smith",
    "start": 11,
    "end": 22,
    "score": 0.9999,
}
{
    "entity_group": "private_email",
    "word": "alice.smith@example.com",
    "start": 39,
    "end": 62,
    "score": 0.9998,
}
```
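
Because each entity carries character-level `start`/`end` offsets into the original string, redaction reduces to slicing. A small sketch built on the example output above (`redact` is a hypothetical helper, not part of the OpenMed API):

```python
def redact(text: str, entities: list, mask: str = "[{label}]") -> str:
    """Replace each detected span with a placeholder. Spans are processed
    right to left so earlier offsets stay valid while the string changes."""
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = mask.format(label=ent["entity_group"].upper())
        text = text[:ent["start"]] + placeholder + text[ent["end"]:]
    return text

text = "My name is Alice Smith and my email is alice.smith@example.com."
entities = [
    {"entity_group": "private_person", "start": 11, "end": 22},
    {"entity_group": "private_email", "start": 39, "end": 62},
]
print(redact(text, entities))
# My name is [PRIVATE_PERSON] and my email is [PRIVATE_EMAIL].
```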

## Quick Start: Swift and Apple Apps

Add OpenMedKit to your Xcode project:

1. Open Xcode and choose File > Add Package Dependencies.
2. Paste `https://github.com/maziyarpanahi/openmed`.
3. Select the `OpenMedKit` package product.
4. Download and cache the MLX model once, then run inference locally.

```swift
import OpenMedKit

let modelURL = try await OpenMedModelStore.downloadMLXModel(
    repoID: "OpenMed/privacy-filter-mlx-8bit"
)

let openmed = try OpenMed(backend: .mlx(modelDirectoryURL: modelURL))
let entities = try openmed.extractPII(
    "My name is Alice Smith and my email is alice.smith@example.com."
)

for entity in entities {
    print(entity.text, entity.label, entity.score)
}
```

For iOS, run on physical Apple Silicon hardware; the iOS Simulator is not a suitable target for validating MLX inference.

## Validation

The 8-bit artifact was validated against the unquantized OpenMed MLX artifact on fixed text samples: BF16 and Q8 returned identical grouped spans for person, date, phone, email, address, and account-number examples.

OpenMed also includes unit tests for:

- q8 artifact loading
- quantization metadata decoding
- expert tensor packing and `.scales` coverage
- finite logits from the q8 runtime
- bf16/q8 shape and argmax-label coherence
- BIOES/Viterbi span decoding
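
For intuition, BIOES span decoding can be sketched as a greedy pass over per-token labels: `B-` opens a span, `I-` continues it, `E-` closes it, `S-` is a complete single-token span, and `O` is outside any span. The runtime's Viterbi decoder additionally enforces valid label transitions, but the grouping step is the same idea; the `decode_bioes` helper below is illustrative, not OpenMed's implementation:

```python
def decode_bioes(tokens: list, labels: list) -> list:
    """Greedy BIOES decoder: returns (span_text, entity_type) pairs."""
    spans, current, current_type = [], [], None
    for tok, lab in zip(tokens, labels):
        prefix, _, ent = lab.partition("-")
        if prefix == "S":                      # complete single-token span
            spans.append((tok, ent))
            current, current_type = [], None
        elif prefix == "B":                    # open a new span
            current, current_type = [tok], ent
        elif prefix in ("I", "E") and current and ent == current_type:
            current.append(tok)                # continue the open span
            if prefix == "E":                  # close it
                spans.append((" ".join(current), ent))
                current, current_type = [], None
        else:                                  # O, or malformed: drop span
            current, current_type = [], None
    return spans

tokens = ["My", "name", "is", "Alice", "Smith", "."]
labels = ["O", "O", "O", "B-private_person", "E-private_person", "O"]
print(decode_bioes(tokens, labels))
# [('Alice Smith', 'private_person')]
```

Note that the greedy version silently drops malformed sequences (for example, an `I-` with no preceding `B-`), whereas a Viterbi pass would never produce them in the first place.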

## Intended Use

Use this model for local privacy filtering, PII detection, redaction workflows, and evaluation on Apple devices. For high-risk domains such as healthcare, legal, finance, education, and government, evaluate against your own data and policy requirements before production use.

## Credits

- Base checkpoint: [`openai/privacy-filter`](https://huggingface.co/openai/privacy-filter)
- MLX conversion and runtime support: [OpenMed](https://github.com/maziyarpanahi/openmed)
- OpenMed website: [https://openmed.life](https://openmed.life)