vijaym committed
Commit 54b1854 · verified · 1 Parent(s): 4105930

Update README.md

Files changed (1)
  1. README.md +1 -34
README.md CHANGED
@@ -19,8 +19,7 @@ pipeline_tag: token-classification

  Korean fine-tune of [OpenAI Privacy Filter](https://huggingface.co/openai/privacy-filter)
  for span-level PII detection. Adapted via **LoRA** on attention projections only —
- the base's sparse-MoE backbone (1.5B / 50M active params) is kept frozen, with
- just **~614k trainable parameters** (~0.04% of the model).
+ the base's sparse-MoE backbone (1.5B / 50M active params) is kept frozen.

  **[Open Test Notebook](https://huggingface.co/FrameByFrame/privacy-filter-korean/blob/main/test_privacy_filter_ko.ipynb)** — load the model and run all examples interactively.
 
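The adapter recipe named in this hunk (LoRA on the attention projections only, MoE backbone frozen) can be sketched with Hugging Face `peft`. The snippet below is a minimal illustration, not the training code from this repo: the base checkpoint name comes from the model card, while the rank, label count, and target module names (`q_proj`, `k_proj`, `v_proj`, `o_proj`) are assumptions.

```python
# Minimal sketch of LoRA on attention projections only (Hugging Face peft).
# ASSUMPTIONS: rank, label count, and target module names are illustrative;
# they are not recorded in this commit.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForTokenClassification

base = AutoModelForTokenClassification.from_pretrained(
    "openai/privacy-filter",  # base checkpoint referenced by the model card
    num_labels=9,             # placeholder: size of the PII label set
)

lora_cfg = LoraConfig(
    task_type="TOKEN_CLS",
    r=8,                      # assumed rank
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention only
    modules_to_save=["classifier"],  # token-classification head also trains
)

model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()   # experts, router, embeddings stay frozen
```

Because `target_modules` lists only attention projections, the experts and router never enter the optimizer's parameter groups.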
@@ -241,38 +240,6 @@ Each detected entity is one dict:
  | **Hardware** | 2× NVIDIA RTX A5000 (24 GB each) |
  | **Final eval span F1** | 0.848 (validation) |

- For full reproduction details, see [`TRAINING.md`](./TRAINING.md).
-
- ## Why MoE + LoRA
-
- Fully fine-tuning the privacy-filter base on KDPII consistently *hurt* the
- weakest labels (`private_person` and `private_address` stuck at F1 ≈ 0.13–0.20).
- With 128 experts and top-4 routing, Korean tokens hit a small expert subset;
- across 5–10 epochs each expert receives sparse gradient updates relative to
- its parameter count, and the optimizer drags those experts away from their
- pretrained representations faster than it teaches the new task. Net effect:
- the base's pretrained Korean capability gets corrupted before the new task is
- learned.
-
- LoRA on attention only (this model) avoids this entirely — experts, FFN,
- embeddings, and router stay exactly as the base shipped them; only attention
- re-routing and the classifier head adapt. Result: F1 0.69 / 0.78 on the
- previously stuck labels, with every other label at or above ceiling.
-
- ## Known Limitations
-
- - **`private_person` residual error** is dominated by KDPII's `PS_NICKNAME`
-   policy. ~40% of remaining person errors are online-handle-style strings
-   (e.g., `탕비실맥심킹`, `퍼터요정`) that KDPII labels as `PS_NICKNAME →
-   private_person`. Downstream redaction is unaffected; classification systems
-   may want to post-classify handles separately.
- - **Foreign names** (Western, Japanese, Arabic transliterations) are detected
-   at lower rates due to limited training exposure.
- - **`private_address` boundaries** follow KDPII's split convention (each
-   toponym component is a separate span). Production redactors typically
-   concatenate adjacent address spans during post-processing (sketched below).
- - Raw model output may have leading/trailing whitespace in span offsets;
-   the `extract_pii` helper above strips it via `text.strip()` on the slice.

  ## Serving with vLLM
 
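The removed "Why MoE + LoRA" section claims that only attention re-routing and the classifier head adapt. That claim can be checked directly with a short audit like the one below, a sketch that assumes `model` is the `peft`-wrapped model from the previous snippet; the `"lora_"` and `"classifier"` bucket names are assumptions about `peft`'s parameter naming. Experts, router, and embeddings should contribute nothing to the trainable count.

```python
# Sketch: verify that only LoRA adapters and the classifier head are trainable.
# Assumes `model` is the peft-wrapped model from the previous snippet.
from collections import Counter

def trainable_breakdown(model) -> Counter:
    """Bucket trainable parameter counts by coarse group name."""
    groups: Counter = Counter()
    total = trainable = 0
    for name, param in model.named_parameters():
        total += param.numel()
        if param.requires_grad:
            trainable += param.numel()
            key = next((k for k in ("lora_", "classifier") if k in name), "other")
            groups[key] += param.numel()
    print(f"trainable: {trainable:,} / {total:,} ({100 * trainable / total:.3f}%)")
    return groups

print(trainable_breakdown(model))  # expect only "lora_" and "classifier" buckets
```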
 
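The last two bullets of the removed "Known Limitations" section both point at light post-processing: tightening span offsets against whitespace and concatenating adjacent `private_address` spans. Here is a minimal sketch of both steps; it assumes each entity is a dict with `label`, `start`, and `end` keys, which is an assumption about the output schema, not something this commit shows.

```python
# Sketch of the post-processing suggested in the removed "Known Limitations":
# (1) strip whitespace from span offsets, (2) merge adjacent address spans.
# ASSUMPTION: entities are dicts with "label", "start", "end" keys; the real
# output schema of extract_pii may differ.

def tighten(text: str, ent: dict) -> dict:
    """Shrink [start, end) so the span has no leading/trailing whitespace."""
    start, end = ent["start"], ent["end"]
    while start < end and text[start].isspace():
        start += 1
    while end > start and text[end - 1].isspace():
        end -= 1
    return {**ent, "start": start, "end": end}

def merge_addresses(text: str, ents: list[dict], max_gap: int = 1) -> list[dict]:
    """Concatenate adjacent private_address spans (KDPII splits toponyms)."""
    merged: list[dict] = []
    for ent in sorted((tighten(text, e) for e in ents), key=lambda e: e["start"]):
        prev = merged[-1] if merged else None
        if (prev is not None
                and prev["label"] == "private_address"
                and ent["label"] == "private_address"
                and ent["start"] - prev["end"] <= max_gap):
            prev["end"] = ent["end"]  # extend the previous address span
        else:
            merged.append(dict(ent))
    return merged
```

With `max_gap=1`, toponym components separated by a single space collapse into one redactable span; a larger gap merges across longer separators at the risk of fusing unrelated spans.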