yasserrmd committed on
Commit 64c0fab · verified · 1 Parent(s): 41149c8

Update README.md

Files changed (1): README.md +119 -46
README.md CHANGED
@@ -1,78 +1,151 @@
 
 
 
  ---
  license: apache-2.0
  base_model: openai/privacy-filter
  tags:
  - token-classification
  - pii-detection
  - onnx
- - browser
  - privacy
- - transformers.js
  library_name: transformers
  pipeline_tag: token-classification
  ---

- # Privacy Filter - ONNX (FP16)

- FP16 ONNX export of [openai/privacy-filter](https://huggingface.co/openai/privacy-filter)
- for in-browser inference via onnxruntime-web. Detects 8 categories of personally
- identifiable information (PII) and returns BIOES token labels.

- ## Files

- - `onnx/model_fp16.onnx` - graph
- - `onnx/model_fp16.onnx.data` - weights (external data, ~2.6 GB)
- - `tokenizer.json`, `tokenizer_config.json`, `special_tokens_map.json` - tokenizer
- - `config.json` - model config with the 33 BIOES label taxonomy
- - `viterbi_calibration.json` - default operating-point biases for the Viterbi decoder

- ## Label taxonomy (33 classes)

- Background class `O` plus BIOES tags (`B-`, `I-`, `E-`, `S-`) for 8 span categories:

- - `account_number`
- - `private_address`
- - `private_date`
- - `private_email`
- - `private_person`
- - `private_phone`
- - `private_url`
- - `secret`

- ## Usage (browser, onnxruntime-web)

- ```javascript
- import * as ort from 'onnxruntime-web';
-
- const session = await ort.InferenceSession.create(
-   'https://huggingface.co/YOUR_REPO/resolve/main/onnx/model_fp16.onnx',
-   { executionProviders: ['webgpu', 'wasm'] }
- );
-
- // Tokenize with @huggingface/tokenizers using tokenizer.json from this repo.
- // Feed int64 input_ids and attention_mask. Output is logits [batch, seq, 33].
- // Decode with a constrained BIOES Viterbi pass using viterbi_calibration.json.
  ```

- Full browser runner (tokenizer + ONNX + Viterbi decoder in JS) is in the
- conversion project's `web/` folder.

  ## Export notes

- - Exported with `torch.onnx.export(dynamo=True)` from `transformers>=5.6.0.dev0`
- - MoE blocks (128 experts top-4) rewritten to a dense-weighted-sum form for
-   ONNX compatibility while preserving reference math
- - FP16 precision (original is BF16). Keeps int64 inputs/outputs
- - Dynamic axes on batch and sequence length. Practical browser range: 256-4096
-   tokens depending on memory
- - Parity vs reference PyTorch: 100% argmax agreement on seed prompts

  ## License

- Apache 2.0, same as the base model.

- ## Acknowledgements

- Base model by OpenAI. See the
- [original model card](https://huggingface.co/openai/privacy-filter) for
- training details, intended use, and limitations.
  ---
  license: apache-2.0
  base_model: openai/privacy-filter
  tags:
  - token-classification
  - pii-detection
+ - pii-masking
  - onnx
+ - onnxruntime
  - privacy
  library_name: transformers
  pipeline_tag: token-classification
+ language:
+ - en
  ---

+ # Privacy Filter (ONNX, FP16)
+
+ FP16 ONNX export of [`openai/privacy-filter`](https://huggingface.co/openai/privacy-filter) for efficient inference with ONNX Runtime. The model detects eight categories of personally identifiable information (PII) in text and returns BIOES-tagged token spans.
+
+ The exported graph has dynamic batch and sequence dimensions, and has been validated against the original PyTorch implementation with 100% argmax agreement on reference prompts.
+
+ ## Model details
+
+ | | |
+ |---|---|
+ | Base model | `openai/privacy-filter` |
+ | Parameters | 1.5 B total, 50 M active (128-expert top-4 MoE) |
+ | Precision | FP16 weights, FP32 router |
+ | Context length | Up to 128k tokens (dynamic) |
+ | Label set | 33 classes (`O` + BIOES × 8 categories) |
+ | License | Apache 2.0 |
+
+ ### Detected categories
+
+ `account_number`, `private_address`, `private_date`, `private_email`, `private_person`, `private_phone`, `private_url`, `secret`

+ ### Repository contents
+
+ ```
+ config.json               Model config including the 33-class id2label map
+ tokenizer.json            o200k tokenizer (tiktoken-compatible)
+ tokenizer_config.json
+ special_tokens_map.json
+ viterbi_calibration.json  Default operating-point biases for Viterbi decoding
+ onnx/
+   model_fp16.onnx         Graph
+   model_fp16.onnx.data    Weights (external data, ~2.6 GB)
+ ```
+
+ ## Installation
+
+ ```bash
+ pip install onnxruntime transformers tiktoken numpy huggingface_hub
+ ```

+ For GPU inference, substitute `onnxruntime-gpu` for `onnxruntime`.

+ ## Usage

+ ### Minimal example

+ ```python
+ from huggingface_hub import snapshot_download
+ from transformers import AutoTokenizer
+ import onnxruntime as ort
+ import numpy as np
+ import json
+
+ repo = "yasserrmd/privacy-filter-ONNX"
+ local = snapshot_download(repo)
+
+ tokenizer = AutoTokenizer.from_pretrained(local)
+ session = ort.InferenceSession(
+     f"{local}/onnx/model_fp16.onnx",
+     providers=["CPUExecutionProvider"],  # or ["CUDAExecutionProvider"] for GPU
+ )
+
+ with open(f"{local}/config.json") as f:
+     id2label = {int(k): v for k, v in json.load(f)["id2label"].items()}
+
+ text = "Hi, I'm Alice Smith, email alice@example.com."
+ enc = tokenizer(text, return_tensors="np", add_special_tokens=False)
+ logits = session.run(None, {
+     "input_ids": enc["input_ids"].astype(np.int64),
+     "attention_mask": enc["attention_mask"].astype(np.int64),
+ })[0]
+
+ labels = [id2label[int(i)] for i in logits[0].argmax(-1)]
+ tokens = tokenizer.convert_ids_to_tokens(enc["input_ids"][0])
+ for tok, lbl in zip(tokens, labels):
+     if lbl != "O":
+         print(f"{tok:<20} {lbl}")
  ```

+ ### Complete usage with span decoding
+
+ The raw model output is per-token logits over the 33 BIOES classes. For coherent spans, decode the logits with a constrained Viterbi pass using the biases in `viterbi_calibration.json`. A reference implementation is included in `examples/detect.py` in the export project; the essential steps are:
+
+ 1. Tokenize the input with `return_offsets_mapping=True` to recover character positions.
+ 2. Run the ONNX session to obtain logits of shape `[1, seq_len, 33]`.
+ 3. Run Viterbi decoding over the 33 labels with legal BIOES transitions.
+ 4. Group the resulting label sequence into spans and map token indices back to character spans via the offsets.
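Step 4 can be sketched as a small helper (a minimal illustration, not the reference implementation; the `labels` and `offsets` inputs are assumed to come from steps 1–3, and malformed tag sequences are simply dropped):

```python
def labels_to_spans(labels, offsets):
    """Group a BIOES label sequence into (category, char_start, char_end) spans."""
    spans, start = [], None
    for (s, e), lbl in zip(offsets, labels):
        prefix, _, cat = lbl.partition("-")
        if prefix == "S":                          # single-token span
            spans.append((cat, s, e))
        elif prefix == "B":                        # span opens
            start = s
        elif prefix == "I":                        # span continues
            pass
        elif prefix == "E" and start is not None:  # span closes
            spans.append((cat, start, e))
            start = None
        else:                                      # "O" or an unmatched E-tag
            start = None
    return spans
```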
107
+
108
+ The `viterbi_calibration.json` file holds six transition-bias parameters under `operating_points.default.biases` that control the precision/recall trade-off. The defaults in this file are zeroed and match the reference implementation's `default` operating point.
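The constrained Viterbi pass itself can be sketched in numpy. This illustrates the technique only, not the reference decoder: the label order below is an assumption (real code should take it from `id2label` in `config.json`), and the calibration biases would be added to the transition scores before decoding.

```python
import numpy as np

# BIOES label set: "O" plus B-/I-/E-/S- per category.  NOTE: assumed order;
# the actual column order must come from id2label in config.json.
CATEGORIES = ["account_number", "private_address", "private_date", "private_email",
              "private_person", "private_phone", "private_url", "secret"]
LABELS = ["O"] + [f"{p}-{c}" for c in CATEGORIES for p in "BIES"]

def legal(prev: str, cur: str) -> bool:
    """Legal BIOES transitions: B-x/I-x must continue as I-x or E-x of the
    same category; O, E-x and S-x may be followed by O, B-y or S-y."""
    pp, _, pc = prev.partition("-")
    cp, _, cc = cur.partition("-")
    if pp in ("B", "I"):
        return cp in ("I", "E") and cc == pc
    return cp in ("O", "B", "S")

def viterbi(logits):
    """Best-scoring legal path over per-token logits of shape [seq_len, 33]."""
    trans = np.array([[0.0 if legal(p, c) else -np.inf for c in LABELS]
                      for p in LABELS])
    seq_len, n = logits.shape
    score = logits[0].astype(np.float64).copy()
    score[[i for i, l in enumerate(LABELS) if l[0] in "IE"]] = -np.inf  # legal start
    back = np.zeros((seq_len, n), dtype=int)
    for t in range(1, seq_len):
        cand = score[:, None] + trans            # [prev, cur]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + logits[t]
    score[[i for i, l in enumerate(LABELS) if l[0] in "BI"]] = -np.inf  # legal end
    path = [int(score.argmax())]
    for t in range(seq_len - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return [LABELS[i] for i in reversed(path)]
```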
+
+ ### Input and output shapes
+
+ | Tensor | Shape | Dtype |
+ |---|---|---|
+ | `input_ids` (input) | `[batch, sequence]` | `int64` |
+ | `attention_mask` (input) | `[batch, sequence]` | `int64` |
+ | `logits` (output) | `[batch, sequence, 33]` | `float32` |
+
+ Both `batch` and `sequence` are dynamic at runtime.
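The dtype requirement matters in practice, since ONNX Runtime rejects feeds whose dtype does not match the graph. A small helper (a sketch; the tensor names are taken from the table above) makes the int64 cast explicit:

```python
import numpy as np

def make_feeds(input_ids, attention_mask):
    """Build the ONNX feed dict with the int64 dtype the graph expects."""
    feeds = {
        "input_ids": np.asarray(input_ids, dtype=np.int64),
        "attention_mask": np.asarray(attention_mask, dtype=np.int64),
    }
    # Both tensors must share the same [batch, sequence] shape.
    assert feeds["input_ids"].shape == feeds["attention_mask"].shape
    return feeds
```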

  ## Export notes

+ - Exported with `torch.onnx.export(dynamo=True)` from `transformers>=5.6.0.dev0` and `torch>=2.6`.
+ - The 128-expert top-4 MoE blocks in each decoder layer were rewritten to a dense weighted-sum form to produce an ONNX-traceable graph while preserving the reference arithmetic, including the clamped-SwiGLU activation (`alpha=1.702`, `limit=7.0`) and the post-experts scaling.
+ - The router linear layer remains in FP32 for numerical stability; all other weights are FP16.
+ - Parity validated against the PyTorch reference: maximum logit difference on the order of 1e-4, with 100% argmax agreement across the standard evaluation prompts.
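The dense weighted-sum rewrite in the second bullet can be sketched in numpy (an illustration of the general technique under assumed shapes, not the exported graph): every expert runs on every token, and routing weights outside the top-k are forced to zero, which avoids the gather/scatter ops that complicate ONNX tracing.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dense_topk_moe(x, w_router, experts, k=4):
    """Top-k MoE as a dense weighted sum over ALL experts.

    Non-selected experts receive weight 0 (their router logit is masked to
    -inf before the softmax), so the result matches a gather-based top-k
    router while keeping the graph trace-friendly."""
    router_logits = x @ w_router                       # [tokens, n_experts]
    kth = np.sort(router_logits, axis=-1)[:, -k][:, None]
    masked = np.where(router_logits >= kth, router_logits, -np.inf)
    weights = softmax(masked)                          # zero outside the top-k
    out = np.zeros_like(x)
    for e, expert in enumerate(experts):
        out += weights[:, e:e + 1] * expert(x)         # dense sum, no gather
    return out
```

The trade-off is extra FLOPs (all 128 experts run on every token) in exchange for an exportable graph; per the note above, the reference arithmetic is unchanged.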

+ ## Intended use and limitations
+
+ This export preserves the behavior of the base model. Its intended use, evaluation results, and limitations are documented in the [base model card](https://huggingface.co/openai/privacy-filter) and the accompanying [OpenAI Privacy Filter Model Card (PDF)](https://cdn.openai.com/pdf/c66281ed-b638-456a-8ce1-97e9f5264a90/OpenAI-Privacy-Filter-Model-Card.pdf). In brief:
+
+ - Optimized primarily for English; multilingual performance varies.
+ - Model-based redaction is a data-minimization aid, not an anonymization guarantee or compliance certification.
+ - For high-sensitivity domains (medical, legal, financial, government), pair with human review and organization-specific policies.

  ## License

+ Apache 2.0, inherited from the base model.

+ ## Citation

+ If you use this export, please cite the base model:
+
+ ```
+ @misc{openai_privacy_filter_2026,
+   title        = {OpenAI Privacy Filter},
+   author       = {OpenAI},
+   year         = {2026},
+   howpublished = {\url{https://huggingface.co/openai/privacy-filter}},
+   note         = {Apache-2.0},
+ }
+ ```