Update README.md
README.md
CHANGED
@@ -220,22 +220,6 @@ on the 10K label-stratified held-out val from `nvidia/Nemotron-PII:test`.
 | **Vehicle** | `license_plate`, `vehicle_identifier` |
 | **Digital** | `ipv4`, `ipv6`, `mac_address`, `device_identifier`, `api_key`, `http_cookie` |
 
-## Training details
-
-| Hyperparameter | Value |
-|---|---|
-| Optimizer | AdamW |
-| Learning rate | 1e-4 |
-| Weight decay | 0.0 |
-| Batch size (per GPU) | 4 |
-| Gradient accumulation | 1 |
-| Max grad norm | 1.0 |
-| Epochs | 5 |
-| Precision | bf16 |
-| Context length | 128K (YaRN RoPE, 128-token sliding window) |
-| Hardware | 1× NVIDIA A100 80GB |
-| Total optimizer steps | 125,000 |
-| Framework | `opf train` v0.1.0 |
 
 **Head initialization**: `opf`'s default "copy-from-matching-base" head init.
 Of the 221 new BIOES classes, 5 had exact matches in the base