Remove vLLM section and Why MoE+LoRA section
README.md CHANGED

@@ -251,22 +251,6 @@ Each detected entity is one dict:
 
 For full reproduction details, see [`TRAINING.md`](./TRAINING.md).
 
-## Why MoE + LoRA
-
-Full fine-tuning the privacy-filter base on KDPII consistently *hurt* the
-weakest labels (`private_person` and `private_address` stuck at F1 ≈ 0.13–0.20).
-With 128 experts and top-4 routing, Korean tokens hit a small expert subset;
-across 5–10 epochs each expert receives sparse gradient updates relative to
-its parameter count, and the optimizer drags those experts away from their
-pretrained representations faster than it teaches the new task. Net effect:
-the base's pretrained Korean capability gets corrupted before the new task is
-learned.
-
-LoRA on attention only (this model) avoids this entirely — experts, FFN,
-embeddings, and router stay exactly as the base shipped them; only attention
-re-routing and the classifier head adapt. Result: F1 0.69 / 0.78 on the
-previously-stuck labels, with every other label at or above ceiling.
-
 ## Known Limitations
 
 - **`private_person` residual error** is dominated by KDPII's `PS_NICKNAME`

@@ -282,18 +266,6 @@ previously-stuck labels, with every other label at or above ceiling.
 - Raw model output may have leading/trailing whitespace in span offsets;
   the `extract_pii` helper above strips them via `text.strip()` on the slice.
 
-## Serving with vLLM
-
-For batched, low-latency inference:
-
-```bash
-vllm serve FrameByFrame/privacy-filter-korean \
-  --task token-classification \
-  --max-model-len 512 \
-  --dtype bfloat16 \
-  --trust-remote-code
-```
-
 ## License
 
 Apache 2.0 (inherited from base
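The retained limitation notes that raw span offsets can include leading/trailing whitespace, which the README's `extract_pii` helper strips via `text.strip()` on the slice. A minimal, hypothetical sketch of that offset-stripping step (the function name `strip_span` and its signature are illustrative, not from the model repo):

```python
# Hypothetical sketch: trim whitespace from a predicted character span and
# adjust the (start, end) offsets so they still index into the original text.
def strip_span(text: str, start: int, end: int) -> tuple[str, int, int]:
    """Return the stripped span text with corrected character offsets."""
    raw = text[start:end]
    stripped = raw.strip()
    # Leading whitespace shifts the start offset forward by its length.
    new_start = start + (len(raw) - len(raw.lstrip()))
    new_end = new_start + len(stripped)
    return stripped, new_start, new_end

# A span predicted with padding on both sides:
print(strip_span("이름: 홍길동 ", 3, 8))  # ('홍길동', 4, 7)
```

Adjusting the offsets (rather than only stripping the text) keeps the returned span consistent with `text[new_start:new_end]`, which matters if downstream code redacts by slicing.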