---
license: apache-2.0
base_model: openai/privacy-filter
tags:
- mlx
- token-classification
- privacy
- pii-detection
- bioes
library_name: mlx
pipeline_tag: token-classification
---
# privacy-filter-mlx (int4)
MLX-converted, int4-quantized weights of [openai/privacy-filter](https://huggingface.co/openai/privacy-filter),
packaged for use with [PrivacyFilterKit](https://github.com/kokluch/privacy-filter-swift), a Swift package
that runs on-device PII detection on Apple platforms via [MLX-Swift](https://github.com/ml-explore/mlx-swift).
## Bundle contents
| File | Purpose |
|------|---------|
| `weights.safetensors` | int4 affine-quantized weights (group_size=64). Embedding + classifier head kept full-precision. |
| `tokenizer.json` | Hugging Face tokenizer (copied verbatim from upstream). |
| `tokenizer_config.json` | Tokenizer config. |
| `id2label.json` | 33-label BIOES table (8 entity types: account_number, private_address, private_date, private_email, private_person, private_phone, private_url, secret). |
| `model_config.json` | Architecture parameters consumed by the Swift runtime. |
| `MANIFEST.json` | SHA-256 hashes of every file in the bundle. |
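
A consumer can check bundle integrity against `MANIFEST.json` before loading weights. The sketch below assumes the manifest is a flat JSON object mapping file names to hex SHA-256 digests; the actual layout may differ.

```python
import hashlib
import json
from pathlib import Path

def verify_bundle(bundle_dir: str, manifest_name: str = "MANIFEST.json") -> list[str]:
    """Return the names of bundle files whose SHA-256 digest does not
    match the manifest entry. An empty list means the bundle is intact.

    Assumes MANIFEST.json maps file names to hex SHA-256 strings.
    """
    root = Path(bundle_dir)
    manifest = json.loads((root / manifest_name).read_text())
    mismatched = []
    for name, expected in manifest.items():
        digest = hashlib.sha256((root / name).read_bytes()).hexdigest()
        if digest != expected:
            mismatched.append(name)
    return mismatched
```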
## Architecture
- 8 transformer layers, hidden size 640, 14 attention heads (2 KV heads, GQA)
- 128 local experts, top-4 MoE routing
- 200,064-token vocabulary, 131,072 max position embeddings, sliding-window attention (window size 128)
- 33-label BIOES head; the Swift decoder derives a BIOES validity mask at runtime
(no learned CRF transition matrix in the upstream checkpoint)
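
Since there is no learned CRF, valid label sequences are enforced purely by the BIOES scheme itself. A minimal illustration of the kind of transition rule the decoder's validity mask encodes (this is an illustrative Python sketch, not the Swift implementation):

```python
def split_label(label: str):
    """Split a BIOES label like 'B-private_email' into (tag, entity_type).
    The 'O' label has no entity type."""
    if label == "O":
        return "O", None
    tag, ent = label.split("-", 1)
    return tag, ent

def bioes_transition_allowed(prev: str, curr: str) -> bool:
    """Whether prev -> curr is a valid BIOES transition.

    O, E-*, S-* end (or stay outside) a span, so any span start may follow;
    B-* and I-* are mid-span, so only I/E of the same entity type may follow.
    """
    p_tag, p_ent = split_label(prev)
    c_tag, c_ent = split_label(curr)
    if p_tag in ("O", "E", "S"):
        return c_tag in ("O", "B", "S")
    # p_tag is B or I: the span must continue with the same entity type
    return c_tag in ("I", "E") and c_ent == p_ent
```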
## Usage (Swift)
```swift
import PrivacyFilterKit
let bundle = URL(fileURLWithPath: "/path/to/privacy-filter-int4-v0.1.0")
let filter = try await PrivacyFilter(source: .directory(bundle))
let entities = try await filter.detect(in: "Email me at jane@example.com")
```
See the [PrivacyFilterKit README](https://github.com/kokluch/privacy-filter-swift) for the full API.
## Conversion pipeline
This bundle was produced with the scripts in [`privacy-filter-swift/scripts/`](https://github.com/kokluch/privacy-filter-swift/tree/main/scripts):
1. `01_download_hf.py` β€” download the upstream checkpoint
2. `02_export_config.py` β€” extract label table, tokenizer, normalized model config
3. `03_convert_mlx.py` β€” rename keys, downcast to bf16, write MLX-friendly safetensors
4. `04_quantize_mlx.py` β€” int4 affine quantization (embedding + classifier head full-precision)
5. `06_export_bundle.py` β€” assemble bundle + MANIFEST + tar.gz archive
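
The int4 step in `04_quantize_mlx.py` uses MLX's affine quantization. For intuition, here is a simplified NumPy sketch of per-group affine (asymmetric) 4-bit quantization with `group_size=64`; MLX's actual packing, rounding mode, and scale/bias representation may differ.

```python
import numpy as np

def quantize_int4_affine(w: np.ndarray, group_size: int = 64):
    """Affine int4 quantization over groups of `group_size` consecutive
    values along the flattened last axis. Returns the uint8 codes (0..15)
    plus per-group scale and minimum, which suffice to dequantize."""
    groups = w.reshape(-1, group_size)
    w_min = groups.min(axis=1, keepdims=True)
    w_max = groups.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / 15.0          # 4-bit code range is 0..15
    scale = np.where(scale == 0, 1.0, scale)  # guard constant groups
    q = np.clip(np.round((groups - w_min) / scale), 0, 15).astype(np.uint8)
    return q, scale, w_min

def dequantize_int4_affine(q, scale, w_min, shape):
    """Reconstruct approximate weights from codes, scales, and minima."""
    return (q.astype(np.float32) * scale + w_min).reshape(shape)
```

Each reconstructed value lies within half a quantization step of the original, which is why the embedding and classifier head, being most sensitive to this error, are kept full-precision.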
## License
Apache 2.0, inherited from the upstream model. See [LICENSE](https://www.apache.org/licenses/LICENSE-2.0.txt).
## Modifications from upstream
This bundle is a derivative of `openai/privacy-filter`. Significant changes:
- Weights converted from PyTorch safetensors to MLX-format safetensors (key rename + bf16 cast).
- int4 affine-quantized (group_size=64). Embedding, classifier head, and any transition matrix
are kept full-precision.
- Bundle adds `model_config.json`, `id2label.json`, and `MANIFEST.json` for the Swift runtime;
no model logic is changed.
## Credits
- Upstream model: [`openai/privacy-filter`](https://huggingface.co/openai/privacy-filter)
- Swift runtime: [PrivacyFilterKit](https://github.com/kokluch/privacy-filter-swift)
- Conversion runtime: [MLX](https://github.com/ml-explore/mlx) / [MLX-Swift](https://github.com/ml-explore/mlx-swift)