---
license: apache-2.0
language:
- ko
- en
tags:
- privacy-filter
- pii-detection
- token-classification
- korean
- lora
- openai-privacy-filter
- bioes
base_model: openai/privacy-filter
pipeline_tag: token-classification
---
# Privacy Filter (Korean)
Korean fine-tune of [OpenAI Privacy Filter](https://huggingface.co/openai/privacy-filter)
for span-level PII detection. Adapted via **LoRA** on the attention projections only;
the base's sparse-MoE backbone (1.5B total / 50M active params) is kept frozen.
**[Open Test Notebook](https://huggingface.co/FrameByFrame/privacy-filter-korean/blob/main/test_privacy_filter_ko.ipynb)**: load the model and run all examples interactively.
## Capabilities
| Category | Description | Example |
|---|---|---|
| `private_person` | Personal name (Korean / Western / handles) | 김민수, John Smith |
| `private_address` | Physical / postal address | 서울특별시 강남구 테헤란로 123 |
| `private_phone` | Phone number | 010-1234-5678 |
| `private_email` | Email address | minsu@example.com |
| `private_date` | Birthday / personally-identifying date | 1985년 3월 12일 |
| `private_url` | Personal URL | github.com/minsu |
| `account_number` | Bank, card, RRN, passport, etc. | 110-234-567890 |
| `personal_handle` | Username / handle | @minsu_dev |
| `ip_address` | IP address | 192.168.1.5 |
## Benchmark Results
Held-out KDPII Korean PII test set, span-level F1:
| label | base | fine-tuned | Ξ” |
|---|---|---|---|
| `private_phone` | 0.65 | **1.00** | +0.35 |
| `private_url` | 0.21 | **1.00** | +0.79 |
| `private_email` | 0.86 | **1.00** | +0.14 |
| `account_number` | 0.31 | **0.98** | +0.67 |
| `private_date` | 0.00 | **0.90** | +0.90 |
| `private_address` | 0.00 | **0.78** | +0.78 |
| `private_person` | 0.06 | **0.69** | +0.63 |
| **Overall** | – | – | **+0.58** |
## Quick Start
### Install
> ⚠️ **Requires `transformers` 5.x (currently dev / from source).** The
> `openai_privacy_filter` architecture is *not* in any stable 4.x PyPI release.
> If you `pip install transformers` and load this model, you'll see
> `KeyError: 'openai_privacy_filter'`.
```bash
pip install --upgrade "git+https://github.com/huggingface/transformers.git" peft torch safetensors accelerate
```
The `--upgrade` flag is critical: without it, `pip install` is a silent
no-op when an older transformers is already present.
After installing, **restart your Python runtime / kernel** so the new
transformers replaces any version pre-loaded into the process. Sanity-check:
```bash
python -c "from transformers.models.auto.configuration_auto import CONFIG_MAPPING_NAMES; assert 'openai_privacy_filter' in CONFIG_MAPPING_NAMES, 'openai_privacy_filter missing - re-install transformers from source and restart runtime'"
```
If you're using Colab, the test notebook handles this automatically (auto-restart).
### Load Model
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch
MODEL_ID = "FrameByFrame/privacy-filter-korean"
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForTokenClassification.from_pretrained(
MODEL_ID, trust_remote_code=True, torch_dtype=torch.bfloat16
)
model.eval()
if torch.cuda.is_available():
model.cuda()
```
`trust_remote_code=True` is required because Privacy Filter ships a custom
`OpenAIPrivacyFilterForTokenClassification` class (gpt-oss-style sparse MoE).
### Inference
The model emits per-token BIOES labels. The helper below decodes them into
character-offset spans with simple constrained logic:
```python
def extract_pii(text: str, max_length: int = 512):
    enc = tokenizer(
        text,
        truncation=True,
        max_length=max_length,
        return_offsets_mapping=True,
        return_tensors="pt",
    )
    offsets = enc.pop("offset_mapping")[0].tolist()
    enc = {k: v.to(model.device) for k, v in enc.items()}
    with torch.no_grad():
        logits = model(**enc).logits
    pred_ids = logits.argmax(-1)[0].tolist()
    id2label = model.config.id2label
    spans = []
    active = None  # currently open span: (label, char_start, char_end)
    for tok_idx, lid in enumerate(pred_ids):
        label = id2label[int(lid)]
        if label == "O":
            if active is not None:
                spans.append(active)
                active = None
            continue
        prefix, cat = label.split("-", 1)
        c_start, c_end = offsets[tok_idx]
        if prefix == "S":
            # Single-token entity: close any open span, emit this one alone.
            if active is not None:
                spans.append(active)
                active = None
            spans.append((cat, c_start, c_end))
        elif prefix == "B":
            # Begin: close any open span, start a new one.
            if active is not None:
                spans.append(active)
            active = (cat, c_start, c_end)
        elif prefix in ("I", "E"):
            if active is not None and active[0] == cat:
                # Extend the open span; E also closes it.
                active = (active[0], active[1], c_end)
                if prefix == "E":
                    spans.append(active)
                    active = None
            else:
                # Inconsistent prefix: close whatever is open; a bare E
                # still yields a single-token span.
                if active is not None:
                    spans.append(active)
                    active = None
                if prefix == "E":
                    spans.append((cat, c_start, c_end))
    if active is not None:
        spans.append(active)
    return [
        {"label": cat, "start": s, "end": e, "text": text[s:e].strip()}
        for cat, s, e in spans
        if text[s:e].strip()  # drops empty / special-token spans
    ]
```
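To sanity-check what the decoder consumes, you can print the raw per-token
BIOES stream directly. An illustrative snippet, reusing the `tokenizer` and
`model` loaded above (the example sentence is arbitrary):
```python
# Print each token next to its predicted BIOES tag (no span decoding).
enc = tokenizer("김민수: 010-1234-5678", return_tensors="pt").to(model.device)
with torch.no_grad():
    pred_ids = model(**enc).logits.argmax(-1)[0].tolist()
for tok, lid in zip(tokenizer.convert_ids_to_tokens(enc["input_ids"][0]), pred_ids):
    print(f"{tok:>12}  {model.config.id2label[int(lid)]}")
```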
### Test
#### Korean: name + phone + email
```python
>>> extract_pii("김민수의 전화번호는 010-1234-5678이고 이메일은 minsu@example.com입니다.")
[
{"label": "private_person", "start": 0, "end": 3, "text": "김민수"},
{"label": "private_phone", "start": 11, "end": 24, "text": "010-1234-5678"},
{"label": "private_email", "start": 32, "end": 49, "text": "minsu@example.com"},
]
```
#### Korean: address + name
```python
>>> extract_pii("서울특별시 강남구 테헤란로 123에 사는 박지영씨에게 연락주세요.")
[
{"label": "private_address", "start": 0, "end": 5, "text": "서울특별시"},
{"label": "private_address", "start": 6, "end": 9, "text": "강남구"},
{"label": "private_address", "start": 10, "end": 18, "text": "테헤란로 123"},
{"label": "private_person", "start": 23, "end": 26, "text": "박지영"},
]
```
> Note: the model follows KDPII's address convention, where each toponym
> component is its own span. Most downstream redaction systems concatenate
> adjacent address spans, as sketched below.
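A minimal sketch of such a merge step, assuming the `extract_pii` output
format above (the whitespace-only-gap rule is a downstream simplification,
not part of the model):
```python
def merge_address_spans(text: str, spans: list) -> list:
    """Concatenate private_address spans separated only by whitespace."""
    merged = []
    for s in sorted(spans, key=lambda s: s["start"]):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev["label"] == s["label"] == "private_address"
            and not text[prev["end"]:s["start"]].strip()
        ):
            # Extend the previous address span over the whitespace gap.
            prev["end"] = s["end"]
            prev["text"] = text[prev["start"]:prev["end"]]
        else:
            merged.append(dict(s))
    return merged
```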
#### Korean: form-style document
```python
>>> extract_pii('''고객 정보
... 이름: 이수진
... 생년월일: 1985년 3월 12일
... 주소: 부산광역시 해운대구 우동 1457
... 연락처: 010-9876-5432''')
[
{"label": "private_person", ..., "text": "이수진"},
{"label": "private_date", ..., "text": "1985년 3월 12일"},
{"label": "private_address", ..., "text": "부산광역시"},
{"label": "private_address", ..., "text": "해운대구"},
{"label": "private_address", ..., "text": "우동 1457"},
{"label": "private_phone", ..., "text": "010-9876-5432"},
]
```
#### English: account + email
```python
>>> extract_pii("Wire to acct 110-234-567890, contact minsu@example.com")
[
{"label": "account_number", "start": 13, "end": 26, "text": "110-234-567890"},
{"label": "private_email", "start": 36, "end": 53, "text": "minsu@example.com"},
]
```
### Redaction
Wrap the spans into a redactor. Replacements are applied from the end of the
string so that earlier character offsets stay valid:
```python
def redact(text: str) -> str:
    spans = extract_pii(text)
    # Replace from the end so earlier offsets remain valid.
    spans.sort(key=lambda s: s["start"], reverse=True)
    out = text
    for s in spans:
        out = out[: s["start"]] + f"[{s['label'].upper()}]" + out[s["end"]:]
    return out

>>> redact("김민수님의 번호는 010-1234-5678입니다.")
"[PRIVATE_PERSON]님의 번호는 [PRIVATE_PHONE]입니다."
```
## Output Schema
Each detected entity is one dict:
| field | description |
|---|---|
| `label` | One of the 9 categories above |
| `start` | Character offset start (inclusive) |
| `end` | Character offset end (exclusive) |
| `text` | The matched substring |
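The dicts are plain JSON-serializable Python objects. One usage note: pass
`ensure_ascii=False` to `json.dumps` so Hangul stays readable:
```python
import json

# Dump detected entities as UTF-8 JSON for downstream consumers.
print(json.dumps(extract_pii("연락처: 010-9876-5432"), ensure_ascii=False, indent=2))
```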
## Training Details
| | |
|---|---|
| **Base model** | `openai/privacy-filter` (sparse MoE, 1.5B total / 50M active params, 128 experts, top-4 routing) |
| **Method** | LoRA r=16, alpha=32, dropout=0.05 on attention projections (`q/k/v/o_proj`); classifier head fully trainable; everything else frozen |
| **Trainable params** | ~614k (~0.04% of the model) |
| **Datasets** | KDPII (Korean, ~53k records, deterministic 5/5/90 test/val/train), `korean_rrn_synthetic` (train only) |
| **Optimizer** | AdamW, lr=5e-4, cosine schedule, warmup 0.1 |
| **Batch** | 64 per device Γ— 2 GPUs = 128 effective |
| **Epochs** | 10, early stopping on `eval_span_f1` (patience 3) |
| **Sequence length** | 512 |
| **Precision** | bf16 mixed (saved as bf16 safetensors after `merge_and_unload`) |
| **Hardware** | 2Γ— NVIDIA RTX A5000 (24 GB each) |
| **Final eval span F1** | 0.848 (validation) |
For full reproduction details, see [`TRAINING.md`](./TRAINING.md).
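For orientation only, a PEFT setup matching the table might look like the
sketch below; the classifier-head module name (`score`) is an assumption
that depends on the architecture, so defer to the actual script in
`TRAINING.md`:
```python
from peft import LoraConfig, get_peft_model

# LoRA on attention projections only; everything else stays frozen.
lora_cfg = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    modules_to_save=["score"],  # assumed classifier-head module name
    task_type="TOKEN_CLS",
)
peft_model = get_peft_model(model, lora_cfg)
peft_model.print_trainable_parameters()  # expect ~614k trainable params
```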
## Known Limitations
- **`private_person` residual error** is dominated by KDPII's `PS_NICKNAME`
  policy. ~40% of remaining person errors are online-handle-style strings
  (e.g., `탕비실맥심킹`, `퍼터요정`) that KDPII labels as `PS_NICKNAME →
  private_person`. Downstream redaction is unaffected; classification systems
  may want to post-classify handles separately.
- **Foreign names** (Western, Japanese, Arabic transliterations) detected at
lower rates due to limited training exposure.
- **`private_address` boundaries** follow KDPII's split convention (each
  toponym component is a separate span). Production redactors typically
  concatenate adjacent address spans during post-processing (see the merge
  sketch above).
- Raw model output may have leading/trailing whitespace in span offsets;
the `extract_pii` helper above strips them via `text.strip()` on the slice.
## License
Apache 2.0 (inherited from base
[OpenAI Privacy Filter](https://huggingface.co/openai/privacy-filter)).
## Citation
If you use this model:
```bibtex
@misc{framebyframe-privacy-filter-korean-2026,
title = {Privacy Filter Korean: LoRA fine-tune of OpenAI Privacy Filter for Korean PII},
author = {FrameByFrame},
year = {2026},
url = {https://huggingface.co/FrameByFrame/privacy-filter-korean}
}
```