vijaym committed
Commit 54b1854 · verified · 1 Parent(s): 4105930

Update README.md

Files changed (1)
  1. README.md +1 -34
README.md CHANGED
@@ -19,8 +19,7 @@ pipeline_tag: token-classification

  Korean fine-tune of [OpenAI Privacy Filter](https://huggingface.co/openai/privacy-filter)
  for span-level PII detection. Adapted via **LoRA** on attention projections only —
- the base's sparse-MoE backbone (1.5B / 50M active params) is kept frozen, with
- just **~614k trainable parameters** (~0.04% of the model).
+ the base's sparse-MoE backbone (1.5B / 50M active params) is kept frozen.

  **[Open Test Notebook](https://huggingface.co/FrameByFrame/privacy-filter-korean/blob/main/test_privacy_filter_ko.ipynb)** — load the model and run all examples interactively.
 
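The adapter recipe named in this hunk (LoRA on the attention projections only, MoE backbone frozen) can be sketched with Hugging Face `peft`. The snippet below is a minimal illustration, not the training code from this repo: the base checkpoint name comes from the model card, while the rank, label count, and target module names (`q_proj`, `k_proj`, `v_proj`, `o_proj`) are assumptions.

```python
# Minimal sketch of LoRA on attention projections only (Hugging Face peft).
# ASSUMPTIONS: rank, label count, and target module names are illustrative;
# they are not recorded in this commit.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForTokenClassification

base = AutoModelForTokenClassification.from_pretrained(
    "openai/privacy-filter",  # base checkpoint referenced by the model card
    num_labels=9,             # placeholder: size of the PII label set
)

lora_cfg = LoraConfig(
    task_type="TOKEN_CLS",
    r=8,                      # assumed rank
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention only
    modules_to_save=["classifier"],  # token-classification head also trains
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()   # experts, router, embeddings stay frozen
```

Because `target_modules` lists only attention projections, the experts and router never enter the optimizer's parameter groups.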
@@ -241,38 +240,6 @@ Each detected entity is one dict:
  | **Hardware** | 2× NVIDIA RTX A5000 (24 GB each) |
  | **Final eval span F1** | 0.848 (validation) |

- For full reproduction details, see [`TRAINING.md`](./TRAINING.md).
-
- ## Why MoE + LoRA
-
- Fully fine-tuning the privacy-filter base on KDPII consistently *hurt* the
- weakest labels (`private_person` and `private_address` stuck at F1 ≈ 0.13–0.20).
- With 128 experts and top-4 routing, Korean tokens hit a small expert subset;
- across 5–10 epochs each expert receives sparse gradient updates relative to
- its parameter count, and the optimizer drags those experts away from their
- pretrained representations faster than it teaches the new task. Net effect:
- the base's pretrained Korean capability gets corrupted before the new task is
- learned.
-
- LoRA on attention only (this model) avoids this entirely — experts, FFN,
- embeddings, and router stay exactly as the base shipped them; only attention
- re-routing and the classifier head adapt. Result: F1 0.69 / 0.78 on the
- previously stuck labels, with every other label at or above ceiling.
-
- ## Known Limitations
-
- - **`private_person` residual error** is dominated by KDPII's `PS_NICKNAME`
-   policy. ~40% of remaining person errors are online-handle-style strings
-   (e.g., `탕비실맥심킹`, `퍼터요정`) that KDPII labels as `PS_NICKNAME →
-   private_person`. Downstream redaction is unaffected; classification systems
-   may want to post-classify handles separately.
- - **Foreign names** (Western, Japanese, Arabic transliterations) are detected
-   at lower rates due to limited training exposure.
- - **`private_address` boundaries** follow KDPII's split convention (each
-   toponym component is a separate span). Production redactors typically
-   concatenate adjacent address spans during post-processing (sketched below).
- - Raw model output may have leading/trailing whitespace in span offsets;
-   the `extract_pii` helper above strips it via `text.strip()` on the slice.

  ## Serving with vLLM
 
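The removed "Why MoE + LoRA" section claims that only attention re-routing and the classifier head adapt. That claim can be checked directly with a short audit like the one below, a sketch that assumes `model` is the `peft`-wrapped model from the previous snippet; the `"lora_"` and `"classifier"` bucket names are assumptions about `peft`'s parameter naming. Experts, router, and embeddings should contribute nothing to the trainable count.

```python
# Sketch: verify that only LoRA adapters and the classifier head are trainable.
# Assumes `model` is the peft-wrapped model from the previous snippet.
from collections import Counter

def trainable_breakdown(model) -> Counter:
    """Bucket trainable parameter counts by coarse group name."""
    groups: Counter = Counter()
    total = trainable = 0
    for name, param in model.named_parameters():
        total += param.numel()
        if param.requires_grad:
            trainable += param.numel()
            key = next((k for k in ("lora_", "classifier") if k in name), "other")
            groups[key] += param.numel()
    print(f"trainable: {trainable:,} / {total:,} ({100 * trainable / total:.3f}%)")
    return groups

print(trainable_breakdown(model))  # expect only "lora_" and "classifier" buckets
```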
 
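The last two bullets of the removed "Known Limitations" section both point at light post-processing: tightening span offsets against whitespace and concatenating adjacent `private_address` spans. Here is a minimal sketch of both steps; it assumes each entity is a dict with `label`, `start`, and `end` keys, which is an assumption about the output schema, not something this commit shows.

```python
# Sketch of the post-processing suggested in the removed "Known Limitations":
# (1) strip whitespace from span offsets, (2) merge adjacent address spans.
# ASSUMPTION: entities are dicts with "label", "start", "end" keys; the real
# output schema of extract_pii may differ.

def tighten(text: str, ent: dict) -> dict:
    """Shrink [start, end) so the span has no leading/trailing whitespace."""
    start, end = ent["start"], ent["end"]
    while start < end and text[start].isspace():
        start += 1
    while end > start and text[end - 1].isspace():
        end -= 1
    return {**ent, "start": start, "end": end}

def merge_addresses(text: str, ents: list[dict], max_gap: int = 1) -> list[dict]:
    """Concatenate adjacent private_address spans (KDPII splits toponyms)."""
    merged: list[dict] = []
    for ent in sorted((tighten(text, e) for e in ents), key=lambda e: e["start"]):
        prev = merged[-1] if merged else None
        if (prev is not None
                and prev["label"] == "private_address"
                and ent["label"] == "private_address"
                and ent["start"] - prev["end"] <= max_gap):
            prev["end"] = ent["end"]  # extend the previous address span
        else:
            merged.append(dict(ent))
    return merged
```

With `max_gap=1`, toponym components separated by a single space collapse into one redactable span; a larger gap merges across longer separators at the risk of fusing unrelated spans.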