---
license: apache-2.0
base_model: openai/privacy-filter
tags:
- token-classification
- pii-detection
- pii-masking
- onnx
- onnxruntime
- privacy
library_name: transformers
pipeline_tag: token-classification
language:
- en
---
# Privacy Filter (ONNX, FP16)
FP16 ONNX export of [`openai/privacy-filter`](https://huggingface.co/openai/privacy-filter) for efficient inference with ONNX Runtime. The model detects eight categories of personally identifiable information (PII) in text and returns BIOES-tagged token spans.
The exported graph has dynamic batch and sequence dimensions, and has been validated against the original PyTorch implementation with 100% argmax agreement on reference prompts.
## Model details
| | |
|---|---|
| Base model | `openai/privacy-filter` |
| Parameters | 1.5 B total, 50 M active (128-expert top-4 MoE) |
| Precision | FP16 weights, FP32 router |
| Context length | Up to 128k tokens (dynamic) |
| Label set | 33 classes (`O` + BIOES × 8 categories) |
| License | Apache 2.0 |
### Detected categories
`account_number`, `private_address`, `private_date`, `private_email`, `private_person`, `private_phone`, `private_url`, `secret`
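The 33-class count follows directly from these categories: `O` plus four BIOES tags per category. A minimal sketch of reconstructing the label set, assuming the common `B-`/`I-`/`E-`/`S-` prefix convention (the authoritative names are in `config.json`'s `id2label`):

```python
# Rebuild the 33-label BIOES set from the eight categories.
# The B-/I-/E-/S- prefix naming is an assumption; check id2label
# in config.json for the exact strings the model uses.
CATEGORIES = [
    "account_number", "private_address", "private_date", "private_email",
    "private_person", "private_phone", "private_url", "secret",
]

labels = ["O"] + [
    f"{prefix}-{cat}" for cat in CATEGORIES for prefix in ("B", "I", "E", "S")
]

assert len(labels) == 1 + 4 * len(CATEGORIES) == 33
```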
### Repository contents
```
config.json Model config including the 33-class id2label map
tokenizer.json o200k tokenizer (tiktoken-compatible)
tokenizer_config.json
special_tokens_map.json
viterbi_calibration.json Default operating-point biases for Viterbi decoding
onnx/
model_fp16.onnx Graph
model_fp16.onnx.data Weights (external data, ~2.6 GB)
```
## Installation
```bash
pip install onnxruntime transformers tiktoken numpy huggingface_hub
```
For GPU inference, substitute `onnxruntime-gpu` for `onnxruntime`.
## Usage
### Minimal example
```python
from huggingface_hub import snapshot_download
from transformers import AutoTokenizer
import onnxruntime as ort
import numpy as np
import json
repo = "yasserrmd/privacy-filter-ONNX"
local = snapshot_download(repo)
tokenizer = AutoTokenizer.from_pretrained(local)
session = ort.InferenceSession(
f"{local}/onnx/model_fp16.onnx",
providers=["CPUExecutionProvider"], # or ["CUDAExecutionProvider"] for GPU
)
with open(f"{local}/config.json") as f:
id2label = {int(k): v for k, v in json.load(f)["id2label"].items()}
text = "Hi, I'm Alice Smith, email alice@example.com."
enc = tokenizer(text, return_tensors="np", add_special_tokens=False)
logits = session.run(None, {
"input_ids": enc["input_ids"].astype(np.int64),
"attention_mask": enc["attention_mask"].astype(np.int64),
})[0]
labels = [id2label[int(i)] for i in logits[0].argmax(-1)]
tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
for tok, lbl in zip(tokens, labels):
if lbl != "O":
print(f"{tok:<20} {lbl}")
```
### Complete usage with span decoding
The raw model output is per-token logits over 33 BIOES classes. For coherent spans, decode the logits with a constrained Viterbi pass using the biases in `viterbi_calibration.json`. A reference implementation is included in `examples/detect.py` in the export project; the essential steps are:
1. Tokenize the input with `return_offsets_mapping=True` to recover character positions.
2. Run the ONNX session to obtain logits of shape `[1, seq_len, 33]`.
3. Run Viterbi decoding over the 33 labels with legal BIOES transitions.
4. Group the resulting label sequence into spans and map token indices back to character spans via the offsets.
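Steps 3 and 4 can be sketched with a constrained Viterbi pass over a reduced label set (one category plus `O`); the reference decoder covers all 33 labels and folds in the calibration biases, so treat this as an illustration rather than the shipped implementation:

```python
import numpy as np

# Minimal constrained Viterbi over BIOES tags for a single category.
# Label indices: 0=O, 1=B, 2=I, 3=E, 4=S. Only legal BIOES transitions
# are allowed; the real decoder extends this to 33 labels and adds the
# transition biases from viterbi_calibration.json.
NEG = -1e9

def transition_matrix():
    t = np.full((5, 5), NEG)
    for src in (0, 3, 4):      # after O, E, or S: O, a new B, or a new S
        t[src, [0, 1, 4]] = 0.0
    for src in (1, 2):         # after B or I: continue with I or close with E
        t[src, [2, 3]] = 0.0
    return t

def viterbi(logits):
    """logits: [seq_len, 5] array -> best legal label-index sequence."""
    T = transition_matrix()
    seq_len = logits.shape[0]
    score = logits[0].copy()
    score[[2, 3]] = NEG        # a sequence cannot open with I or E
    back = np.zeros((seq_len, 5), dtype=int)
    for i in range(1, seq_len):
        cand = score[:, None] + T + logits[i][None, :]
        back[i] = cand.argmax(0)
        score = cand.max(0)
    score[[1, 2]] = NEG        # a sequence cannot end on B or I
    path = [int(score.argmax())]
    for i in range(seq_len - 1, 0, -1):
        path.append(int(back[i, path[-1]]))
    return path[::-1]

def spans(path, offsets):
    """Group a BIOES path into character spans via token offsets."""
    out, start = [], None
    for i, lab in enumerate(path):
        if lab == 4:                          # S: single-token span
            out.append((offsets[i][0], offsets[i][1]))
        elif lab == 1:                        # B: open a span
            start = offsets[i][0]
        elif lab == 3 and start is not None:  # E: close the open span
            out.append((start, offsets[i][1]))
            start = None
    return out
```

With token offsets from `return_offsets_mapping=True`, the returned pairs index directly into the original string.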
The `viterbi_calibration.json` file holds six transition-bias parameters under `operating_points.default.biases` that control the precision/recall trade-off. The defaults in this file are zeroed and match the reference implementation's `default` operating point.
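Reading those biases could look like the following sketch; the `operating_points.default.biases` path is taken from the description above, while the key names inside `biases` are left unspecified here:

```python
import json

def load_biases(path="viterbi_calibration.json"):
    """Return the transition biases for the default operating point."""
    with open(path) as f:
        calib = json.load(f)
    return calib["operating_points"]["default"]["biases"]
```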
### Input and output shapes
| Tensor | Shape | Dtype |
|---|---|---|
| `input_ids` (input) | `[batch, sequence]` | `int64` |
| `attention_mask` (input) | `[batch, sequence]` | `int64` |
| `logits` (output) | `[batch, sequence, 33]` | `float32` |
Both `batch` and `sequence` are dynamic at runtime.
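Because both axes are dynamic, multiple texts can be batched by right-padding to a common length and zeroing the attention mask over the padding. A minimal sketch (the default pad id of `0` is an assumption; prefer `tokenizer.pad_token_id` when it is set):

```python
import numpy as np

def pad_batch(id_lists, pad_id=0):
    """Right-pad variable-length token-id lists into the int64
    input_ids / attention_mask arrays the [batch, sequence] inputs expect."""
    max_len = max(len(ids) for ids in id_lists)
    input_ids = np.full((len(id_lists), max_len), pad_id, dtype=np.int64)
    attention_mask = np.zeros((len(id_lists), max_len), dtype=np.int64)
    for row, ids in enumerate(id_lists):
        input_ids[row, : len(ids)] = ids
        attention_mask[row, : len(ids)] = 1
    return input_ids, attention_mask
```

Feed the two arrays to `session.run` as in the minimal example, and ignore logits at positions where the mask is `0` before taking the argmax.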
## Export notes
- Exported with `torch.onnx.export(dynamo=True)` from `transformers>=5.6.0.dev0` and `torch>=2.6`.
- The 128-expert top-4 MoE blocks in each decoder layer were rewritten to a dense-weighted-sum form to produce an ONNX-traceable graph while preserving reference arithmetic, including the clamped-SwiGLU activation (`alpha=1.702`, `limit=7.0`) and the post-experts scaling.
- The router linear remains in FP32 for numerical stability; all other weights are FP16.
- Parity validated against the PyTorch reference: max logit difference on the order of 1e-4, argmax agreement 100% across the standard evaluation prompts.
## Intended use and limitations
This export preserves the behavior of the base model. Its intended use, evaluation results, and limitations are documented in the [base model card](https://huggingface.co/openai/privacy-filter) and the accompanying [OpenAI Privacy Filter Model Card (PDF)](https://cdn.openai.com/pdf/c66281ed-b638-456a-8ce1-97e9f5264a90/OpenAI-Privacy-Filter-Model-Card.pdf). In brief:
- Optimized primarily for English; multilingual performance varies.
- Model-based redaction is a data-minimization aid, not an anonymization guarantee or compliance certification.
- For high-sensitivity domains (medical, legal, financial, government), pair with human review and organization-specific policies.
## License
Apache 2.0, inherited from the base model.
## Citation
If you use this export, please cite the base model:
```bibtex
@misc{openai_privacy_filter_2026,
title = {OpenAI Privacy Filter},
author = {OpenAI},
year = {2026},
howpublished = {\url{https://huggingface.co/openai/privacy-filter}},
note = {Apache-2.0},
}
```