upstream-license compliance + rigor disclaimer + correct email
README.md CHANGED

@@ -30,7 +30,7 @@ metrics:
 extra_gated_prompt: >-
   This model is licensed CC BY-NC 4.0 (non-commercial). For commercial
   use — production deployment, SaaS / API embedding, agent privacy
-  middleware, custom fine-tunes — contact hi@
+  middleware, custom fine-tunes — contact hi@louis030195.com.
 ---
 
 # screenpipe-pii-redactor
@@ -64,7 +64,7 @@ strings, private-key block markers, password prompts).
 
 > **License: CC BY-NC 4.0** (non-commercial). For commercial use —
 > production redaction, SaaS / API embedding, AI-agent privacy
-> middleware, custom fine-tunes — contact **hi@
+> middleware, custom fine-tunes — contact **hi@louis030195.com**. See
 > [`LICENSE`](LICENSE).
 
 ## TL;DR
@@ -181,6 +181,33 @@ zero-leak rate.
 | previous internal version | 74.5% (71.8–77.5) | 9.1% | 0.763 | 0.932 |
 | OpenAI Privacy Filter (base) | 14.0% (11.7–16.2) | 16.5% | 0.591 | 0.579 |
 
+> **What "14% zero-leak" for the base actually means** (read this before
+> citing the gap). Zero-leak is a strict, taxonomy-coupled metric: a
+> single example counts as "leaked" if the model misses ANY gold span
+> in it under our 12-class label mapping. The published OpenAI Privacy
+> Filter result is **F1 ≈ 96 %** on PII-Masking-300k under THEIR
+> 49-class taxonomy — that's a much more lenient setup. The base scores
+> 14 % zero-leak under our metric for two compounding reasons:
+>
+> 1. **Label-space mismatch** dominates. We map 28 source 300k labels
+>    into our 12 classes; the base model can't predict our label names.
+>    On categories where the base's native taxonomy DOES align with ours
+>    (`private_email`, `private_phone`, `private_url`, `secret`), the
+>    base scores **0.90–1.00 recall** — strong. On categories where it
+>    doesn't (`private_id` covering IDCARD/SOCIALNUMBER/PASSPORT,
+>    `private_handle` covering USERNAME), it scores **0.00** by
+>    definition because it never emits the right label.
+> 2. **Zero-leak is all-or-nothing per example.** With ~6 spans per
+>    300k example and any unmappable category present, base fails the
+>    whole example. Token-level F1 (0.591 above) is the more honest
+>    cross-comparison number.
+>
+> The +63 pp claim **is** real and useful for the deployment context
+> (anyone shipping a system that needs the screenpipe 12-class
+> taxonomy gets +63 pp out of the box vs the base). It would be
+> misleading to read it as "this model is 5× more accurate at PII
+> detection" — that's not what the metric measures.
+
 ### Multilingual generalization (n=200 per language)
 
 This model was trained on English-only data. Cross-language transfer:
@@ -262,7 +289,7 @@ python examples/inference.py
 ```
 
 Verifying the eval scores requires the held-out benchmark. Contact
-**hi@
+**hi@louis030195.com** for benchmark access if you have a research or
 commercial use case.
 
 ## License
@@ -270,7 +297,7 @@ commercial use case.
 [CC BY-NC 4.0](LICENSE) — non-commercial use only.
 
 For commercial licensing (production deployment, redistribution rights,
-SaaS / API embedding, custom fine-tunes for your domain): **hi@
+SaaS / API embedding, custom fine-tunes for your domain): **hi@louis030195.com**.
 
 ## Citation
 
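The divergence the added note describes, that an all-or-nothing zero-leak rate can collapse while a per-span score stays reasonable, is easy to reproduce on toy data. The sketch below is illustrative only (it is not the repo's eval code); `LABEL_MAP`, the example data, and the use of span-level recall instead of token-level F1 are all assumptions made to keep the demo short.

```python
# Hypothetical sketch, NOT the repo's eval harness: shows how an
# all-or-nothing "zero-leak" metric and a per-span score diverge
# under a label mapping. All names and data here are illustrative.

# Map a few source-dataset labels into a 12-class-style taxonomy.
LABEL_MAP = {
    "EMAIL": "private_email",
    "TEL": "private_phone",
    "SOCIALNUMBER": "private_id",
    "USERNAME": "private_handle",
}

def zero_leak_rate(examples):
    """Fraction of examples where EVERY gold span was caught.

    An example 'leaks' if the model misses any gold span, so one
    unmappable category fails the whole example.
    """
    clean = 0
    for gold, predicted in examples:
        gold_mapped = {(text, LABEL_MAP[label]) for text, label in gold}
        if gold_mapped <= predicted:  # all gold spans present
            clean += 1
    return clean / len(examples)

def span_recall(examples):
    """Per-span recall: partial credit per gold span caught, so a model
    strong on some categories still scores well overall."""
    hit = total = 0
    for gold, predicted in examples:
        for text, label in gold:
            total += 1
            if (text, LABEL_MAP[label]) in predicted:
                hit += 1
    return hit / total

# Toy eval: a model perfect on email/phone that never emits
# private_id / private_handle (the label-space-mismatch case).
examples = [
    # (gold spans as (text, source_label), predicted spans as (text, mapped_label))
    ([("a@b.co", "EMAIL"), ("555-0100", "TEL")],
     {("a@b.co", "private_email"), ("555-0100", "private_phone")}),
    ([("a@b.co", "EMAIL"), ("078-05-1120", "SOCIALNUMBER")],
     {("a@b.co", "private_email")}),
    ([("@jdoe", "USERNAME"), ("555-0100", "TEL")],
     {("555-0100", "private_phone")}),
]

print(round(zero_leak_rate(examples), 2))  # 0.33 — 2 of 3 examples leak
print(round(span_recall(examples), 2))     # 0.67 — 4 of 6 gold spans caught
```

One missed category per example is enough to fail that whole example, which is why the note recommends the token-level score as the fairer cross-taxonomy comparison.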