---
license: apache-2.0
language:
  - ko
  - en
tags:
  - privacy-filter
  - pii-detection
  - token-classification
  - korean
  - lora
  - openai-privacy-filter
  - bioes
base_model: openai/privacy-filter
pipeline_tag: token-classification
---

# Privacy Filter – Korean

Korean fine-tune of [OpenAI Privacy Filter](https://huggingface.co/openai/privacy-filter)
for span-level PII detection. Adapted via **LoRA** on the attention projections only;
the base's sparse-MoE backbone (1.5B total / 50M active params) is kept frozen.

**[Open Test Notebook](https://huggingface.co/FrameByFrame/privacy-filter-korean/blob/main/test_privacy_filter_ko.ipynb)**: load the model and run all examples interactively.

## Capabilities

| Category | Description | Example |
|---|---|---|
| `private_person` | Personal name (Korean / Western / handles) | 김민수, John Smith |
| `private_address` | Physical / postal address | 서울특별시 강남구 테헤란로 123 |
| `private_phone` | Phone number | 010-1234-5678 |
| `private_email` | Email address | minsu@example.com |
| `private_date` | Birthday / personally-identifying date | 1985년 3월 12일 |
| `private_url` | Personal URL | github.com/minsu |
| `account_number` | Bank, card, RRN, passport, etc. | 110-234-567890 |
| `personal_handle` | Username / handle | @minsu_dev |
| `ip_address` | IP address | 192.168.1.5 |
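
At the token level these categories carry **BIOES** prefixes (`B-`/`I-`/`E-`/`S-` plus `O`), which is what the decoding helper in Inference below relies on. As a quick sanity check of the tag inventory, using the `model` object loaded in Quick Start (the 37-label count of one `O` plus four prefixes per category is an expectation, not a guarantee; verify against your checkpoint):

```python
# List the tag inventory straight from the checkpoint config
# (run after the Quick Start load below).
from collections import defaultdict

prefixes_by_category = defaultdict(set)
for tag in model.config.id2label.values():
    if tag == "O":
        continue
    prefix, category = tag.split("-", 1)
    prefixes_by_category[category].add(prefix)

print(len(model.config.id2label), dict(prefixes_by_category))
```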

## Benchmark Results

Held-out KDPII Korean PII test set, span-level F1:

| label | base | fine-tuned | Δ |
|---|---|---|---|
| `private_phone` | 0.65 | **1.00** | +0.35 |
| `private_url` | 0.21 | **1.00** | +0.79 |
| `private_email` | 0.86 | **1.00** | +0.14 |
| `account_number` | 0.31 | **0.98** | +0.67 |
| `private_date` | 0.00 | **0.90** | +0.90 |
| `private_address` | 0.00 | **0.78** | +0.78 |
| `private_person` | 0.06 | **0.69** | +0.63 |
| **Overall** | – | – | **+0.58** |
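
For reference, span-level F1 here means exact matches on (label, start, end) triples. A minimal sketch, assuming the usual exact-match convention (the precise matching rules behind the table are not restated here):

```python
def span_f1(pred: set, gold: set) -> float:
    """Exact-match span F1 over (label, start, end) triples.

    A predicted span counts as correct only if its label and both
    character offsets match a gold span exactly.
    """
    tp = len(pred & gold)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```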

## Quick Start

### Install

> ⚠️ **Requires `transformers` 5.x (currently dev / from source).** The
> `openai_privacy_filter` architecture is *not* in any stable 4.x PyPI release.
> If you `pip install transformers` and load this model, you'll see
> `KeyError: 'openai_privacy_filter'`.

```bash
pip install --upgrade "git+https://github.com/huggingface/transformers.git" peft torch safetensors accelerate
```

The `--upgrade` flag is critical: without it, `pip install` is silently a
no-op when an older transformers is already present.

After installing, **restart your Python runtime / kernel** so the new
transformers replaces any version pre-loaded into the process. Sanity-check:

```bash
python -c "from transformers.models.auto.configuration_auto import CONFIG_MAPPING_NAMES; assert 'openai_privacy_filter' in CONFIG_MAPPING_NAMES, 'openai_privacy_filter missing β€” re-install transformers from source and restart runtime'"
```

If you're using Colab, the test notebook handles this automatically (auto-restart).

### Load Model

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification
import torch

MODEL_ID = "FrameByFrame/privacy-filter-korean"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModelForTokenClassification.from_pretrained(
    MODEL_ID, trust_remote_code=True, dtype=torch.bfloat16
)
model.eval()
if torch.cuda.is_available():
    model.cuda()
```

`trust_remote_code=True` is required because Privacy Filter ships a custom
`OpenAIPrivacyFilterForTokenClassification` class (gpt-oss-style sparse MoE).

### Inference

The model emits per-token BIOES labels. The helper below decodes them into
character-offset spans with simple constrained logic:

```python
def extract_pii(text: str, max_length: int = 512):
    enc = tokenizer(
        text,
        truncation=True,
        max_length=max_length,
        return_offsets_mapping=True,
        return_tensors="pt",
    )
    offsets = enc.pop("offset_mapping")[0].tolist()
    enc = {k: v.to(model.device) for k, v in enc.items()}
    with torch.no_grad():
        logits = model(**enc).logits
    pred_ids = logits.argmax(-1)[0].tolist()
    id2label = model.config.id2label

    spans = []
    active = None  # open entity: (label, char_start, char_end)
    for tok_idx, lid in enumerate(pred_ids):
        label = id2label[int(lid)]
        if label == "O":
            if active is not None:
                spans.append(active)
                active = None
            continue
        prefix, cat = label.split("-", 1)
        c_start, c_end = offsets[tok_idx]
        if prefix == "S":  # single-token entity
            if active is not None:
                spans.append(active)
                active = None
            spans.append((cat, c_start, c_end))
        elif prefix == "B":  # begin a multi-token entity
            if active is not None:
                spans.append(active)
            active = (cat, c_start, c_end)
        elif prefix in ("I", "E"):  # inside / end of the open entity
            if active and active[0] == cat:
                active = (active[0], active[1], c_end)
                if prefix == "E":  # E closes the entity immediately
                    spans.append(active)
                    active = None
            else:
                if active is not None:
                    spans.append(active)
                    active = None
                if prefix == "E":  # orphan E: treat as a single-token span
                    spans.append((cat, c_start, c_end))
    if active is not None:
        spans.append(active)

    return [
        {"label": cat, "start": s, "end": e, "text": text[s:e].strip()}
        for cat, s, e in spans
        if text[s:e].strip()
    ]
```

### Test

#### Korean: name + phone + email
```python
>>> extract_pii("김민수의 전화번호는 010-1234-5678이고 이메일은 minsu@example.com입니다.")
[
  {"label": "private_person", "start": 0, "end": 3, "text": "김민수"},
  {"label": "private_phone",  "start": 11, "end": 24, "text": "010-1234-5678"},
  {"label": "private_email",  "start": 32, "end": 49, "text": "minsu@example.com"},
]
```

#### Korean: address + name
```python
>>> extract_pii("서울특별시 강남구 테헤란로 123에 사는 박지영씨에게 연락주세요.")
[
  {"label": "private_address", "start": 0, "end": 5, "text": "서울특별시"},
  {"label": "private_address", "start": 6, "end": 9, "text": "강남구"},
  {"label": "private_address", "start": 10, "end": 18, "text": "테헤란로 123"},
  {"label": "private_person",  "start": 23, "end": 26, "text": "박지영"},
]
```

> Note: the model follows KDPII's address convention where each toponym
> component is its own span. Most downstream redaction systems concatenate
> adjacent address spans; a minimal sketch of that post-processing follows.
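
A possible merge pass over `extract_pii` output, as a sketch. The `max_gap=1` whitespace rule and the `private_address` default are illustrative assumptions, not KDPII policy:

```python
def merge_adjacent_spans(spans, text, label="private_address", max_gap=1):
    """Merge same-label spans whose gap is only whitespace of length <= max_gap."""
    merged = []
    for s in sorted(spans, key=lambda x: x["start"]):
        prev = merged[-1] if merged else None
        if (
            prev is not None
            and prev["label"] == label == s["label"]
            and s["start"] - prev["end"] <= max_gap
            and not text[prev["end"]:s["start"]].strip()
        ):
            # Extend the previous span across the whitespace gap.
            prev["end"] = s["end"]
            prev["text"] = text[prev["start"]:prev["end"]]
        else:
            merged.append(dict(s))
    return merged
```

Applied to the example above:

```python
>>> text = "서울특별시 강남구 테헤란로 123에 사는 박지영씨에게 연락주세요."
>>> merge_adjacent_spans(extract_pii(text), text)
[
  {"label": "private_address", "start": 0, "end": 18, "text": "서울특별시 강남구 테헤란로 123"},
  {"label": "private_person",  "start": 23, "end": 26, "text": "박지영"},
]
```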

#### Korean: form-style document
```python
>>> extract_pii('''고객 정보
... 이름: 이수진
... 생년월일: 1985년 3월 12일
... 주소: 부산광역시 해운대구 우동 1457
... 연락처: 010-9876-5432''')
[
  {"label": "private_person",  ..., "text": "이수진"},
  {"label": "private_date",    ..., "text": "1985년 3월 12일"},
  {"label": "private_address", ..., "text": "부산광역시"},
  {"label": "private_address", ..., "text": "해운대구"},
  {"label": "private_address", ..., "text": "우동 1457"},
  {"label": "private_phone",   ..., "text": "010-9876-5432"},
]
```

#### English: account + email
```python
>>> extract_pii("Wire to acct 110-234-567890, contact minsu@example.com")
[
  {"label": "account_number", "start": 13, "end": 26, "text": "110-234-567890"},
  {"label": "private_email",  "start": 36, "end": 53, "text": "minsu@example.com"},
]
```

### Redaction

Wrap the spans into a redactor:

```python
def redact(text: str) -> str:
    """Replace each detected span with its upper-cased label."""
    spans = extract_pii(text)
    spans.sort(key=lambda s: s["start"], reverse=True)  # right-to-left keeps offsets valid
    out = text
    for s in spans:
        out = out[: s["start"]] + f"[{s['label'].upper()}]" + out[s["end"]:]
    return out

>>> redact("김민수님의 번호는 010-1234-5678입니다.")
"[PRIVATE_PERSON]님의 번호는 [PRIVATE_PHONE]입니다."
```

## Output Schema

Each detected entity is one dict:

| field | description |
|---|---|
| `label` | One of the 9 categories above |
| `start` | Character offset start (inclusive) |
| `end` | Character offset end (exclusive) |
| `text` | The matched substring |

## Training Details

| | |
|---|---|
| **Base model** | `openai/privacy-filter` (sparse MoE, 1.5B total / 50M active params, 128 experts, top-4 routing) |
| **Method** | LoRA r=16, alpha=32, dropout=0.05 on attention projections (`q/k/v/o_proj`); classifier head fully trainable; everything else frozen |
| **Trainable params** | ~614k (~0.04% of the model) |
| **Datasets** | KDPII (Korean, ~53k records, deterministic 5/5/90 test/val/train), `korean_rrn_synthetic` (train only) |
| **Optimizer** | AdamW, lr=5e-4, cosine schedule, warmup 0.1 |
| **Batch** | 64 per device × 2 GPUs = 128 effective |
| **Epochs** | 10, early stopping on `eval_span_f1` (patience 3) |
| **Sequence length** | 512 |
| **Precision** | bf16 mixed (saved as bf16 safetensors after `merge_and_unload`) |
| **Hardware** | 2Γ— NVIDIA RTX A5000 (24 GB each) |
| **Final eval span F1** | 0.848 (validation) |

For full reproduction details, see [`TRAINING.md`](./TRAINING.md).
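
As a rough reconstruction of the adapter setup in the table, a sketch with `peft` (not the actual training script; the `q/k/v/o_proj` module names assume gpt-oss-style attention naming):

```python
from peft import LoraConfig, TaskType, get_peft_model

# Reconstruction of the config implied by the table above; see TRAINING.md
# for the real script. Module names are an assumption about the backbone.
lora_config = LoraConfig(
    task_type=TaskType.TOKEN_CLS,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    modules_to_save=["classifier"],  # classifier head fully trainable
)
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()  # expect roughly ~614k trainable
```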

## Known Limitations

- **`private_person` residual error** is dominated by KDPII's `PS_NICKNAME`
  policy. ~40% of remaining person errors are online-handle-style strings
  (e.g., `탕비실맥심킹`, `퍼터요정`) that KDPII labels as `PS_NICKNAME →
  private_person`. Downstream redaction is unaffected; classification systems
  may want to post-classify handles separately (see the sketch after this list).
- **Foreign names** (Western, Japanese, Arabic transliterations) are detected
  at lower rates due to limited training exposure.
- **`private_address` boundaries** follow KDPII's split convention (each
  toponym component is a separate span). Production redactors typically
  concatenate adjacent address spans during post-processing.
- Raw model output may have leading/trailing whitespace in span offsets;
  the `extract_pii` helper above strips them via `text.strip()` on the slice.
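
One possible post-classification pass for the nickname issue above, as a sketch. The handle heuristic (an `@` prefix or a bare ASCII username shape) is an assumption, and it will not catch pure-Hangul nicknames like `탕비실맥심킹`:

```python
import re

# Hypothetical heuristic, not part of the model or KDPII:
# "@name" or a bare ASCII username shape.
HANDLE_RE = re.compile(r"^@[\w.]+$|^[A-Za-z0-9_.]{3,}$")

def post_classify_handles(spans):
    """Relabel handle-like private_person spans as personal_handle."""
    out = []
    for s in spans:
        if s["label"] == "private_person" and HANDLE_RE.match(s["text"]):
            s = dict(s, label="personal_handle")
        out.append(s)
    return out
```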

## License

Apache 2.0 (inherited from base
[OpenAI Privacy Filter](https://huggingface.co/openai/privacy-filter)).

## Citation

If you use this model:

```bibtex
@misc{framebyframe-privacy-filter-korean-2026,
  title  = {Privacy Filter Korean: LoRA fine-tune of OpenAI Privacy Filter for Korean PII},
  author = {FrameByFrame},
  year   = {2026},
  url    = {https://huggingface.co/FrameByFrame/privacy-filter-korean}
}
```