louis030195 committed
Commit 1503442 · verified · 1 parent: 5c307a8

upstream-license compliance + rigor disclaimer + correct email

Files changed (1)
  1. NOTICE +48 -0
NOTICE ADDED
@@ -0,0 +1,48 @@
+ screenpipe-pii-redactor
+ Copyright 2026 Mediar, Inc.
+
+ This product is a derivative work of the OpenAI Privacy Filter
+ (https://github.com/openai/privacy-filter), licensed under the
+ Apache License, Version 2.0. The full upstream Apache 2.0 license
+ text is preserved in LICENSE.upstream-apache2.txt.
+
+ The base model architecture, tokenizer, and pretrained weights are the
+ work of OpenAI and are licensed Apache 2.0. screenpipe-pii-redactor
+ extends those weights via supervised fine-tuning on a custom corpus and
+ re-initializes the output head for a 12-label PII taxonomy specific to
+ desktop activity logs.
+
+ Significant modifications introduced by this derivative:
+ - Output head re-initialized for a 12-class PII label space
+ (29 rows copied from upstream where labels aligned, 20 rows
+ initialized from random for new classes; see model/finetune_summary.json).
+ - Fine-tuned for 3 epochs on a mixed corpus of:
+ * synthetic accessibility / window-title / OCR data
+ (private — not redistributed)
+ * a 25% slice of ai4privacy/pii-masking-300k (CC-BY-4.0)
+ with labels mapped to the 12-class taxonomy
+ * targeted secret-shape augmentation (private)
+ - Context window n_ctx raised from 128 to 256.
+ - Hyperparameters: batch_size=4, lr=1e-4, weight_decay=0,
+ max_grad_norm=1.0, shuffle_seed=1337.
+
+ Distribution license:
+
+ - The fine-tuned weights and accompanying materials in this repository
+ (the "Derivative Work") are licensed under CC BY-NC 4.0; see LICENSE.
+ - The Apache 2.0 obligations on the base model are preserved by:
+ (a) shipping LICENSE.upstream-apache2.txt with this repo,
+ (b) attributing OpenAI Privacy Filter in README.md,
+ (c) declaring significant modifications above.
+ - "OpenAI" and "Privacy Filter" are trademarks / brands of OpenAI; this
+ derivative does not use those marks to endorse or suggest endorsement
+ of this work by OpenAI.
+
+ Third-party datasets used during fine-tuning:
+
+ - ai4privacy/pii-masking-300k
+ https://huggingface.co/datasets/ai4privacy/pii-masking-300k
+ Licensed CC-BY-4.0.
+
+ Questions about license compatibility or commercial use:
+ hi@louis030195.com
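
The NOTICE describes an output head in which rows are copied from upstream where labels align and randomly initialized otherwise. A minimal sketch of that idea follows. It is not the repository's code: the function name `reinit_head`, the toy label names, the Gaussian initializer, and the `std` and `seed` values are all assumptions made for illustration.

```python
import random


def reinit_head(upstream_head, new_labels, hidden_dim, seed=0, std=0.02):
    """Build a new output head for `new_labels`.

    upstream_head: dict mapping an upstream label name to its weight row
                   (a list of `hidden_dim` floats). Hypothetical layout.
    Returns (rows, copied, fresh): rows are in `new_labels` order; `copied`
    and `fresh` list which labels were reused vs. randomly initialized.
    """
    rng = random.Random(seed)  # local RNG so the result is reproducible
    rows, copied, fresh = [], [], []
    for label in new_labels:
        if label in upstream_head:
            # Label exists upstream: reuse its pretrained row (as a copy).
            rows.append(list(upstream_head[label]))
            copied.append(label)
        else:
            # New label: draw a small random row (assumed Gaussian init).
            rows.append([rng.gauss(0.0, std) for _ in range(hidden_dim)])
            fresh.append(label)
    return rows, copied, fresh


# Toy example with made-up labels and a 4-dimensional hidden state.
upstream = {
    "O": [0.1, 0.2, 0.3, 0.4],
    "B-EMAIL": [0.5, 0.6, 0.7, 0.8],
    "I-EMAIL": [0.9, 1.0, 1.1, 1.2],
}
new_labels = ["O", "B-EMAIL", "I-EMAIL", "B-SECRET", "I-SECRET"]
rows, copied, fresh = reinit_head(upstream, new_labels, hidden_dim=4)
print(f"copied={len(copied)} fresh={len(fresh)}")
```

In the commit's terms, `copied` would correspond to the 29 rows taken from upstream and `fresh` to the 20 rows initialized for new classes; the actual mapping is recorded in model/finetune_summary.json.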