Update KoHRM model card
README.md
CHANGED
|
@@ -1,21 +1,88 @@
|
|
| 1 |
---
|
| 2 |
license: other
|
| 3 |
tags:
|
| 4 |
- hrm-text
|
| 5 |
- korean
|
| 6 |
- terminal
|
| 7 |
- tool-use
|
| 8 |
-
-
| 9 |
---
|
| 10 |
|
| 11 |
# KoHRM-Text-1.4B
|
| 12 |
|
| 13 |
-
|
| 14 |
|
| 15 |
-
-
|
| 16 |
-
- Epoch: `1`
|
| 17 |
-
- Upload policy: epoch-level upload only, to avoid slowing training with frequent network syncs.
|
| 18 |
-
- Format: HRM-Text training checkpoint (`fsdp2_epoch_*`) plus carry/config/tokenizer metadata.
|
| 19 |
|
| 20 |
-
|
| 21 |
-
| 1 |
---
|
| 2 |
license: other
|
| 3 |
+
language:
|
| 4 |
+
- ko
|
| 5 |
+
- en
|
| 6 |
tags:
|
| 7 |
- hrm-text
|
| 8 |
- korean
|
| 9 |
- terminal
|
| 10 |
- tool-use
|
| 11 |
+
- code
|
| 12 |
+
- pretraining
|
| 13 |
+
pipeline_tag: text-generation
|
| 14 |
---
|
| 15 |
|
| 16 |
# KoHRM-Text-1.4B
|
| 17 |
|
| 18 |
+
`KoHRM-Text-1.4B` is a model being pretrained from scratch on the PrefixLM training structure of `sapientinc/HRM-Text`, targeting usability in Korean, English, coding, terminal work, and tool calling.
|
| 19 |
|
| 20 |
+
This card is a work-in-progress draft of the model card as of 2026-05-23. The epoch artifacts currently being uploaded are raw HRM-Text FSDP2 checkpoints, not a final release format that can be loaded directly in Transformers.
|
| 21 |
|
| 22 |
+
## Model information
|
| 23 |
+
|
| 24 |
+
| Item | Value |
|
| 25 |
+
|---|---|
|
| 26 |
+
| model id | `LLM-OS-Models/KoHRM-Text-1.4B` |
|
| 27 |
+
| base code | `sapientinc/HRM-Text` |
|
| 28 |
+
| training from | scratch |
|
| 29 |
+
| architecture | HRM-Text `XL` |
|
| 30 |
+
| params | 1,384,120,320 |
|
| 31 |
+
| context | 4096 tokens |
|
| 32 |
+
| dtype | bfloat16 |
|
| 33 |
+
| tokenizer | byte-level BPE, NFC normalization |
|
| 34 |
+
| vocab | 131,072 |
|
| 35 |
+
|
| 36 |
+
## Tokenizer
|
| 37 |
+
|
| 38 |
+
The new tokenizer was trained with Korean, English, code, shell, terminal instructions, and JSON tool calls all taken into account.
|
| 39 |
+
|
| 40 |
+
| Sample | chars/token |
|
| 41 |
+
|---|---:|
|
| 42 |
+
| Korean, general | 2.60 |
|
| 43 |
+
| Korean, legal | 2.36 |
|
| 44 |
+
| Korean, terminal instructions | 2.18 |
|
| 45 |
+
| shell command | 2.68 |
|
| 46 |
+
| tool JSON | 3.32 |
|
| 47 |
+
| Python code | 3.37 |
|
| 48 |
+
| English | 4.40 |
|
| 49 |
+
|
| 50 |
+
Tokenizer repo: `LLM-OS-Models/HRM-Text-Ko-Terminal-Tokenizer-131K`
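
The chars/token figures in the table can be reproduced in spirit with a small helper. This is a minimal sketch, not the script used for this card: it assumes characters are counted after NFC normalization (the card only states that the tokenizer uses NFC), and `byte_encode` is a hypothetical stand-in encoder so the snippet runs without downloading anything. Swap in the real tokenizer's encode function to measure actual ratios.

```python
import unicodedata
from typing import Callable, Sequence


def chars_per_token(text: str, encode: Callable[[str], Sequence[int]]) -> float:
    """Return NFC characters per token for `text` under `encode`."""
    # Normalize first so decomposed Hangul (NFD) is counted the same as NFC.
    normalized = unicodedata.normalize("NFC", text)
    tokens = encode(normalized)
    if not tokens:
        raise ValueError("encoder produced no tokens")
    return len(normalized) / len(tokens)


def byte_encode(text: str) -> list[int]:
    """Stand-in encoder (hypothetical): one token per UTF-8 byte, no merges."""
    return list(text.encode("utf-8"))


sample = "한국어"
ratio = chars_per_token(sample, byte_encode)
print(f"{ratio:.2f} chars/token")
```

A pure byte-level encoder with no merges gives a very low ratio for Hangul because each syllable is three UTF-8 bytes. The BPE merges are what lift the Korean rows in the table above 2 chars/token.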
|
| 51 |
+
|
| 52 |
+
## Training data
|
| 53 |
+
|
| 54 |
+
The stage-0 input is a preprocessed 711.3M-token mix.
|
| 55 |
+
|
| 56 |
+
| Data | Tokens |
|
| 57 |
+
|---|---:|
|
| 58 |
+
| HRM cleaned base sample | 250.0M |
|
| 59 |
+
| SWE-ZERO + GLM reasoning mix | 251.2M |
|
| 60 |
+
| Korean statute/ordinance/administrative rule/case law tasks | 83.1M |
|
| 61 |
+
| ToolBench train tool-call task | 127.0M |
|
| 62 |
+
| Total | 711.3M |
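
As a quick sanity check on the table, the four rows can be summed and turned into shares of the mix. The token counts are copied from the table; the shortened dictionary key for the Korean legal row is only a label chosen for this sketch.

```python
# Token counts in millions, copied from the stage-0 table.
stage0_mix = {
    "HRM cleaned base sample": 250.0,
    "SWE-ZERO + GLM reasoning mix": 251.2,
    "Korean legal tasks": 83.1,
    "ToolBench train tool-call task": 127.0,
}

# Round to one decimal to avoid floating-point noise in the sum.
total = round(sum(stage0_mix.values()), 1)
shares = {name: tokens / total for name, tokens in stage0_mix.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"total: {total}M tokens")
```

The rows add up to the stated 711.3M, with the two largest sources each contributing a little over a third of the mix.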
|
| 63 |
+
|
| 64 |
+
Later stages sequentially add the retokenized original HRM cleaned dataset, a local terminal dataset, and additional Korean, coding, and tool-call data. Evaluation-type data (`tb2_lite`, Terminal Bench 2, ToolBench eval, chi-bench) is excluded from training.
|
| 65 |
+
|
| 66 |
+
## Training method
|
| 67 |
+
|
| 68 |
+
- Objective: PrefixLM style response-only loss
|
| 69 |
+
- Optimizer: HRM-Text upstream Adam-atan2
|
| 70 |
+
- Context: 4096 tokens
|
| 71 |
+
- Hardware: 8 x NVIDIA H200
|
| 72 |
+
- Current stable global batch: 172,032 tokens
|
| 73 |
+
- Checkpoint policy: epoch-level raw FSDP2 checkpoint upload
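
The "response-only loss" objective in the list above can be illustrated with a small pure-Python sketch. It covers only the loss masking: prefix (prompt) positions are skipped and cross-entropy is averaged over response positions. It does not model PrefixLM attention, and the function name and argument layout are assumptions for illustration, not the HRM-Text implementation.

```python
import math


def response_only_loss(logits, targets, is_response):
    """Mean cross-entropy over positions flagged as response tokens.

    logits: one list of vocabulary scores per position
    targets: target token id per position
    is_response: True where the position belongs to the response
    """
    total, count = 0.0, 0
    for scores, target, keep in zip(logits, targets, is_response):
        if not keep:
            continue  # prefix (prompt) positions contribute no loss
        # Numerically stable log-sum-exp over the vocabulary.
        peak = max(scores)
        log_norm = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_norm - scores[target]
        count += 1
    if count == 0:
        raise ValueError("no response positions to score")
    return total / count


logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 3.0, 0.0]]
targets = [0, 2, 1]
mask = [False, True, True]  # the first position is part of the prefix
loss = response_only_loss(logits, targets, mask)
print(f"loss = {loss:.4f}")
```

Because the first position is masked out, changing its logits or target leaves the loss unchanged, which is the property that distinguishes this objective from plain next-token loss over the whole sequence.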
|
| 74 |
+
|
| 75 |
+
The paper's default global batch is 196,608 tokens, but this model's vocabulary is larger at 131,072, so the final logits take more memory. For long runs, 172,032 tokens is used as the default to leave headroom against OOM.
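
The arithmetic behind this trade-off can be sketched as follows. The context length, vocabulary size, GPU count, and dtype come from this card. The rest is assumption: that tokens are split evenly across the 8 GPUs, that batches consist of full 4096-token sequences, and that logits for every token are materialized at once in bfloat16. A real training loop may chunk the output projection or upcast to float32, so treat the numbers as a rough estimate only.

```python
CONTEXT = 4096      # tokens per sequence (from the card)
VOCAB = 131_072     # vocabulary size (from the card)
GPUS = 8            # 8 x NVIDIA H200 (from the card)
BYTES_BF16 = 2      # bytes per bfloat16 value


def logits_gib_per_gpu(global_batch_tokens: int) -> float:
    """GiB of bfloat16 logits per GPU if every token's logits exist at once."""
    per_gpu_tokens = global_batch_tokens / GPUS
    return per_gpu_tokens * VOCAB * BYTES_BF16 / 2**30


for batch in (196_608, 172_032):
    seqs = batch // CONTEXT
    print(f"{batch} tokens = {seqs} sequences, "
          f"{logits_gib_per_gpu(batch):.2f} GiB logits/GPU")
```

Under these assumptions the reduction corresponds to going from 48 to 42 full-length sequences per step, and from 6.00 to 5.25 GiB of logits per GPU, before counting gradients of the same shape.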
|
| 76 |
+
|
| 77 |
+
In staged pretraining, each stage inherits the model/optimizer/EMA/carry state from the checkpoint, and `resume_step_offset` and `total_steps_override` align the LR schedule with the full pretraining run. In other words, training restarts whenever new data is ready, but the optimizer and schedule are kept continuous.
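
One way to picture the role of `resume_step_offset` and `total_steps_override` is a schedule indexed by the global step rather than the stage-local step. The sketch below is hypothetical: the warmup-plus-cosine shape, the default values, and the helper names `lr_at` and `staged_lr` are assumptions for illustration, and the way HRM-Text actually consumes these two settings may differ.

```python
import math


def lr_at(step, total_steps, base_lr=1e-4, warmup_steps=100, min_ratio=0.1):
    """Linear warmup then cosine decay, indexed by the global step."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return base_lr * (min_ratio + (1.0 - min_ratio) * cosine)


def staged_lr(local_step, resume_step_offset, total_steps_override):
    """LR for a resumed stage: shift the stage-local step onto the global schedule."""
    return lr_at(local_step + resume_step_offset, total_steps_override)


# Hypothetical numbers: stage 0 ran 4,000 steps of a 20,000-step plan.
last_lr_of_stage0 = lr_at(3_999, 20_000)
first_lr_of_stage1 = staged_lr(0, resume_step_offset=4_000, total_steps_override=20_000)
naive_restart_lr = staged_lr(0, resume_step_offset=0, total_steps_override=20_000)
print(last_lr_of_stage0, first_lr_of_stage1, naive_restart_lr)
```

With the offset applied, the first step of the next stage continues from where the previous stage left off. Without it, the restarted stage would fall back into warmup and replay the start of the schedule.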
|
| 78 |
+
|
| 79 |
+
## Current status
|
| 80 |
+
|
| 81 |
+
- stage-0 training: in progress
|
| 82 |
+
- HF upload: epoch checkpoint watcher active
|
| 83 |
+
- final Transformers conversion: not yet produced
|
| 84 |
+
- public benchmark score: not yet evaluated for this model
|
| 85 |
+
|
| 86 |
+
## Limitations
|
| 87 |
+
|
| 88 |
+
The current checkpoint artifacts are intermediate training outputs. The model has not completed safety alignment, final instruction tuning, final benchmarking, or conversion for release. Korean terminal and tool-call capability is the target area, but stage-0 alone does not guarantee finished performance.
|