gyung committed
Commit db5a5b3 · verified · 1 Parent(s): f9e20ea

Update KoHRM model card

Files changed (1)
  1. README.md +75 -8
README.md CHANGED
@@ -1,21 +1,88 @@
  ---
  license: other
  tags:
  - hrm-text
  - korean
  - terminal
  - tool-use
- - checkpoint
  ---

  # KoHRM-Text-1.4B

- Raw HRM-Text FSDP2 checkpoint artifact.

- - Source checkpoint root: `/home/work/.data/hrm_text_checkpoints/KoHRM-Text-1.4B-stage0-available-mix-gbs172`
- - Epoch: `1`
- - Upload policy: epoch-level upload only, to avoid slowing training with frequent network syncs.
- - Format: HRM-Text training checkpoint (`fsdp2_epoch_*`) plus carry/config/tokenizer metadata.

- This is primarily for monitoring and recovery. Final model-only exports should be produced with
- `HRM-Text/conversion/convert_to_hf.py` after a checkpoint is selected.

  ---
  license: other
+ language:
+ - ko
+ - en
  tags:
  - hrm-text
  - korean
  - terminal
  - tool-use
+ - code
+ - pretraining
+ pipeline_tag: text-generation
  ---

  # KoHRM-Text-1.4B

+ `KoHRM-Text-1.4B` is a model being pretrained from scratch on the PrefixLM training setup of `sapientinc/HRM-Text`, targeting usability in Korean, English, coding, terminal use, and tool calling.

+ This card is a work-in-progress model card draft as of 2026-05-23. The epoch artifacts currently being uploaded are raw HRM-Text FSDP2 checkpoints, not a final release format that can be loaded directly in Transformers.

+ ## Model information
+
+ | Item | Value |
+ |---|---|
+ | model id | `LLM-OS-Models/KoHRM-Text-1.4B` |
+ | base code | `sapientinc/HRM-Text` |
+ | training from | scratch |
+ | architecture | HRM-Text `XL` |
+ | params | 1,384,120,320 |
+ | context | 4096 tokens |
+ | dtype | bfloat16 |
+ | tokenizer | byte-level BPE, NFC normalization |
+ | vocab | 131,072 |
+
+ ## Tokenizer
+
+ The new tokenizer was trained with Korean, English, code, shell, terminal instructions, and JSON tool calls all taken into account.
+
+ | Sample | chars/token |
+ |---|---:|
+ | Korean, general | 2.60 |
+ | Korean, legal | 2.36 |
+ | Korean, terminal instructions | 2.18 |
+ | shell command | 2.68 |
+ | tool JSON | 3.32 |
+ | Python code | 3.37 |
+ | English | 4.40 |
+
+ Tokenizer repo: `LLM-OS-Models/HRM-Text-Ko-Terminal-Tokenizer-131K`
+
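The chars/token column in the tokenizer table can be understood as characters divided by token count for a sample text. The sketch below is a minimal illustration, not the script used for the card: the helper name `chars_per_token` is hypothetical, the whitespace tokenizer is only a stand-in for the real 131K BPE tokenizer, and counting characters after NFC normalization is an assumption (the card does not say whether characters were counted before or after normalization).

```python
import unicodedata
from typing import Callable, Sequence


def chars_per_token(text: str, encode: Callable[[str], Sequence]) -> float:
    """Average characters per token, counted after NFC normalization (assumed)."""
    normalized = unicodedata.normalize("NFC", text)
    tokens = encode(normalized)
    if not tokens:
        raise ValueError("encode() returned no tokens")
    return len(normalized) / len(tokens)


# Stand-in tokenizer: whitespace split. Swap in the real tokenizer's
# encode function to measure actual compression.
print(chars_per_token("ls -la /tmp", str.split))
```

A higher value means fewer tokens per character, so more text fits into the 4096-token context.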
+ ## Training data
+
+ The stage-0 input is a fully preprocessed 711.3M-token mix.
+
+ | Data | Tokens |
+ |---|---:|
+ | HRM cleaned base sample | 250.0M |
+ | SWE-ZERO + GLM reasoning mix | 251.2M |
+ | Korean law / ordinance / administrative rule / case-law tasks | 83.1M |
+ | ToolBench train tool-call tasks | 127.0M |
+ | Total | 711.3M |
+
+ Later stages add, in order, the retokenized original HRM cleaned dataset, a local terminal dataset, and additional Korean, coding, and tool-call data. Evaluation-type data (`tb2_lite`, Terminal Bench 2, ToolBench eval, chi-bench) is excluded from training.
+
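As a quick check on the stage-0 mix, the snippet below sums the four components from the table and derives each component's share of the total. The token counts come from the card; the English dictionary keys and variable names are illustrative only.

```python
# Stage-0 token counts in millions, copied from the training-data table.
stage0_mix_m_tokens = {
    "HRM cleaned base sample": 250.0,
    "SWE-ZERO + GLM reasoning mix": 251.2,
    "Korean legal tasks": 83.1,
    "ToolBench train tool-call tasks": 127.0,
}

total_m_tokens = sum(stage0_mix_m_tokens.values())
shares = {name: count / total_m_tokens for name, count in stage0_mix_m_tokens.items()}

print(f"total: {total_m_tokens:.1f}M tokens")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```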
+ ## Training method
+
+ - Objective: PrefixLM style response-only loss
+ - Optimizer: HRM-Text upstream Adam-atan2
+ - Context: 4096 tokens
+ - Hardware: 8 x NVIDIA H200
+ - Current stable global batch: 172,032 tokens
+ - Checkpoint policy: epoch-level raw FSDP2 checkpoint upload
+
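To make "response-only loss" concrete, the sketch below builds training labels in which prefix (prompt) positions are masked out, so only response tokens contribute to the loss. This shows the general technique only and is not taken from the HRM-Text code: the function name is hypothetical, `-100` is PyTorch's default cross-entropy `ignore_index`, and the attention pattern over the prefix and the next-token shift are left out.

```python
from typing import List, Sequence, Tuple

IGNORE_INDEX = -100  # PyTorch cross-entropy's default ignore_index


def response_only_labels(
    prefix_ids: Sequence[int], response_ids: Sequence[int]
) -> Tuple[List[int], List[int]]:
    """Concatenate prefix and response; mask prefix positions out of the loss."""
    input_ids = list(prefix_ids) + list(response_ids)
    labels = [IGNORE_INDEX] * len(prefix_ids) + list(response_ids)
    return input_ids, labels


# Toy token ids for illustration only.
input_ids, labels = response_only_labels([11, 12, 13], [21, 22])
print(input_ids)
print(labels)
```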
+ The paper's default global batch was 196,608 tokens, but this model's vocab is larger (131,072), so the final logits take more memory. Long runs therefore default to 172,032 tokens to leave headroom against OOM.
+
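The two batch sizes correspond to 48 and 42 full 4096-token sequences (48 x 4096 = 196,608 and 42 x 4096 = 172,032). The back-of-the-envelope estimate below shows how the logits tensor scales with batch size. It is a rough upper bound under stated assumptions, not a measurement: tokens are assumed to be split evenly across the 8 GPUs, logits are assumed to be materialized all at once in bfloat16 (2 bytes per value), and gradients and other activations are ignored.

```python
def logits_gib(
    global_batch_tokens: int,
    vocab_size: int,
    num_gpus: int = 8,
    bytes_per_value: int = 2,  # bfloat16 (assumed dtype for the logits)
) -> float:
    """Approximate per-GPU size of a fully materialized logits tensor, in GiB."""
    tokens_per_gpu = global_batch_tokens // num_gpus
    return tokens_per_gpu * vocab_size * bytes_per_value / 2**30


for batch in (196_608, 172_032):
    print(batch, logits_gib(batch, vocab_size=131_072))
```

Under these assumptions the smaller batch saves roughly 0.75 GiB per GPU for the logits alone.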
+ In staged pretraining, each stage inherits the checkpoint's model, optimizer, EMA, and carry state, and `resume_step_offset` and `total_steps_override` align the LR schedule with the full pretraining run. In other words, training is restarted whenever new data is ready, but the optimizer state and the schedule are kept continuous.
+
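One way to picture how the two options could keep the LR schedule continuous across stages is sketched below. The semantics are assumed, not confirmed by the card: `resume_step_offset` is treated as the number of steps already completed in earlier stages, `total_steps_override` as the length of the whole pretraining run, and the warmup-plus-cosine shape is purely illustrative. The actual HRM-Text schedule may differ.

```python
import math
from typing import Optional


def lr_at(
    local_step: int,
    base_lr: float,
    stage_steps: int,
    warmup_steps: int = 0,
    resume_step_offset: int = 0,
    total_steps_override: Optional[int] = None,
) -> float:
    """Learning rate at a stage-local step, placed on the global schedule."""
    # Shift the stage-local step onto the global step axis.
    step = local_step + resume_step_offset
    # Use the full-run length when given, else only this stage's length.
    total = total_steps_override if total_steps_override is not None else stage_steps
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    progress = min(1.0, (step - warmup_steps) / max(1, total - warmup_steps))
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# A second stage that starts after 500 of 1000 planned global steps
# continues from the middle of the schedule instead of restarting at the peak.
print(lr_at(0, base_lr=1.0, stage_steps=100, resume_step_offset=500, total_steps_override=1000))
```

Without the offset and override, the same stage would start again from step 0 of a 100-step schedule, which is the discontinuity the card says it avoids.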
+ ## Current status
+
+ - stage-0 training: in progress
+ - HF upload: epoch checkpoint watcher active
+ - final Transformers conversion: not yet produced
+ - public benchmark score: not yet evaluated for this model
+
+ ## Limitations
+
+ The current checkpoint artifact is an intermediate training output. It has not gone through safety alignment, final instruction tuning, final benchmarking, or conversion for release. Korean terminal and tool-call capability is a target area, but stage-0 alone does not guarantee finished performance.