gyung committed on
Commit
48e9e29
·
verified ·
1 Parent(s): ecbdf52

Add files using upload-large-folder tool

README.md CHANGED
@@ -1,167 +1,88 @@
1
  ---
 
2
  language:
3
- - en
4
  - ko
5
- library_name: transformers
6
- pipeline_tag: text-generation
7
  tags:
 
 
8
  - terminal
9
- - sft
10
- - vllm
11
- - tb2-lite
12
- - evaluation-pending
13
- base_model: unknown
14
  ---
15
 
16
- # LLM-OS-Models/KoHRM-Text-1.4B
17
-
18
- A Terminal SFT model for automating terminal tasks. Given the task and the previous terminal state, it was trained to generate the next command to run as JSON.
19
-
20
- ## Model Summary
21
-
22
- - Base model: `unknown`
23
- - Training setup: `Terminal SFT`
24
- - Model card snapshot: `2026-05-23 09:04:40 UTC`
25
- - Corrected TB2-lite evaluated results currently indexed: `56`
26
- - Corrected TB2-lite score: `pending / not matched in current result directory`
27
-
28
- ## Quickstart
29
-
30
- Install and log in:
31
-
32
- ```bash
- pip install -U vllm transformers huggingface_hub
- huggingface-cli login
- ```
36
-
37
- Related code:
38
-
39
- - GitHub: https://github.com/LLM-OS-Models/Terminal
40
- - vLLM evaluation runner: `tb2_lite/scripts/replay_eval.py`
41
- - Chat template / fallback generation: `tb2_lite/scripts/prompt_builder.py`
42
- - JSON/command scoring: `tb2_lite/scripts/replay_metrics.py`
43
-
44
- Direct vLLM usage example. As in the evaluation code, the chat template is used first; if the model has no template, a ChatML/Gemma fallback is used.
45
-
46
- ```python
- from transformers import AutoTokenizer
- from vllm import LLM, SamplingParams
-
- model_id = "LLM-OS-Models/KoHRM-Text-1.4B"
- tp = 1
-
- tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
- llm = LLM(
-     model=model_id,
-     tokenizer=model_id,
-     trust_remote_code=True,
-     dtype="bfloat16",
-     tensor_parallel_size=tp,
-     max_model_len=49152,
-     gpu_memory_utilization=0.92,
- )
-
- messages = [
-     {"role": "system", "content": "You are a terminal automation assistant. Return JSON only."},
-     {"role": "user", "content": "Inspect the current directory and list Python files."},
- ]
-
- def render_chatml(messages):
-     parts = []
-     for message in messages:
-         role = "assistant" if message["role"] == "assistant" else message["role"]
-         if role == "tool":
-             role = "user"
-         parts.append(f"<|im_start|>{role}\n{message['content']}<|im_end|>\n")
-     parts.append("<|im_start|>assistant\n")
-     return "".join(parts)
-
- def render_gemma4_turn(messages, empty_thought_channel=False):
-     parts = ["<bos>"]
-     for message in messages:
-         role = "model" if message["role"] == "assistant" else message["role"]
-         if role == "tool":
-             role = "user"
-         parts.append(f"<|turn>{role}\n{message['content'].strip()}<turn|>\n")
-     parts.append("<|turn>model\n")
-     if empty_thought_channel:
-         parts.append("<|channel>thought\n<channel|>")
-     return "".join(parts)
-
- def render_prompt(model_id, tokenizer, messages):
-     model_key = model_id.lower()
-     if "gemma-4" in model_key:
-         try:
-             return tokenizer.apply_chat_template(
-                 messages,
-                 tokenize=False,
-                 add_generation_prompt=True,
-                 enable_thinking=False,
-             )
-         except Exception:
-             return render_gemma4_turn(
-                 messages,
-                 empty_thought_channel=("26b" in model_key or "31b" in model_key),
-             )
-     try:
-         return tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
-     except Exception:
-         return render_chatml(messages)
-
- prompt = render_prompt(model_id, tokenizer, messages)
- sampling = SamplingParams(
-     temperature=0.0,
-     top_p=1.0,
-     max_tokens=1024,
-     repetition_penalty=1.0,
- )
- outputs = llm.generate([prompt], sampling_params=sampling)
- print(outputs[0].outputs[0].text)
- ```
121
-
122
- Recommended output format:
123
-
124
- ```json
- {
-   "analysis": "brief reasoning about the next terminal action",
-   "plan": "short execution plan",
-   "commands": [
-     {"keystrokes": "ls -la\n", "duration": 0.1}
-   ],
-   "task_complete": false
- }
- ```
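A minimal sketch of how a caller might check a model response against the recommended format above before acting on it. The function name `parse_action` and the specific checks are illustrative assumptions; this is not the scoring logic in `tb2_lite/scripts/replay_metrics.py`.

```python
import json

# Keys taken from the recommended output format shown above.
REQUIRED_KEYS = {"analysis", "plan", "commands", "task_complete"}


def parse_action(raw: str) -> dict:
    """Parse a model response and check it has the recommended shape.

    Hypothetical helper: the checks below are an assumption about what a
    consumer would want, not the repository's own validation.
    """
    action = json.loads(raw)
    if not isinstance(action, dict):
        raise ValueError("top-level JSON value must be an object")
    missing = REQUIRED_KEYS - action.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if not isinstance(action["commands"], list):
        raise ValueError("'commands' must be a list")
    for cmd in action["commands"]:
        if not isinstance(cmd, dict) or not isinstance(cmd.get("keystrokes"), str):
            raise ValueError("each command needs a string 'keystrokes' field")
        if not isinstance(cmd.get("duration"), (int, float)):
            raise ValueError("each command needs a numeric 'duration' field")
    if not isinstance(action["task_complete"], bool):
        raise ValueError("'task_complete' must be a boolean")
    return action


example = (
    '{"analysis": "a", "plan": "p", '
    '"commands": [{"keystrokes": "ls -la\\n", "duration": 0.1}], '
    '"task_complete": false}'
)
action = parse_action(example)
```

Rejecting malformed output early fits the card's own advice that generated commands should pass through safeguards before execution.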
134
-
135
- Replay command identical to the one used for evaluation:
136
-
137
- ```bash
- python tb2_lite/scripts/replay_eval.py \
-   --model LLM-OS-Models/KoHRM-Text-1.4B \
-   --model-short LLM-OS-Models__KoHRM-Text-1.4B \
-   --eval-path tb2_lite/data/replay_full.jsonl \
-   --output-dir /home/work/.data/tb2_lite_eval/corrected_readme_models_vllm \
-   --dtype bfloat16 \
-   --tp 1 \
-   --max-model-len 49152 \
-   --max-tokens 1024 \
-   --temperature 0.0 \
-   --top-p 1.0 \
-   --gpu-memory-utilization 0.92 \
-   --language-model-only
- ```
152
-
153
- - Recommended default tensor parallel size: `1`. On OOM, raise `--tp` and `tensor_parallel_size` to 2/4/8.
154
- - The corrected TB2-lite evaluation is fixed at `temperature=0.0`, `top_p=1.0`, `max_tokens=1024`.
155
- - Gemma 4 uses `enable_thinking=False` for JSON output, and the evaluation code automatically applies empty thought channel handling for the 26B/31B variants.
156
-
157
- ## Evaluation Status
158
-
159
- - Current corrected TB2-lite score: `pending`
160
- - Reason: the aggregated results in `/home/work/.data/tb2_lite_eval/corrected_readme_models_vllm` do not currently match this HF repo name directly.
161
- - Next step: run the evaluation through the same `tb2_lite/scripts/replay_eval.py` path, then automatically replace this section with a score card.
162
-
163
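- ## Interpreting the Model Family
164
-
165
- - This repo does not yet have a score that directly matches the current corrected TB2-lite aggregate JSON.
166
- - The TB2-lite score is not a general intelligence benchmark; it measures the ability to reproduce terminal next-action JSON.
167
- - Generated commands must pass through safeguards such as a sandbox, an allowlist, and human review before they are actually executed.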
 
1
  ---
2
+ license: other
3
  language:
 
4
  - ko
5
+ - en
 
6
  tags:
7
+ - hrm-text
8
+ - korean
9
  - terminal
10
+ - tool-use
11
+ - code
12
+ - pretraining
13
+ pipeline_tag: text-generation
 
14
  ---
15
 
16
+ # KoHRM-Text-1.4B
17
+
18
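+ `KoHRM-Text-1.4B` is a model pretrained from scratch on the PrefixLM training architecture of `sapientinc/HRM-Text`, targeting usability in Korean, English, coding, terminal, and tool-call tasks.
19
+
20
+ This card is a work-in-progress draft as of 2026-05-23. The epoch artifacts currently being uploaded are raw HRM-Text FSDP2 checkpoints, not the final distribution format that loads directly in Transformers.
21
+
22
+ ## Model Information
23
+
24
+ | Item | Value |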
25
+ |---|---|
26
+ | model id | `LLM-OS-Models/KoHRM-Text-1.4B` |
27
+ | base code | `sapientinc/HRM-Text` |
28
+ | training from | scratch |
29
+ | architecture | HRM-Text `XL` |
30
+ | params | 1,384,120,320 |
31
+ | context | 4096 tokens |
32
+ | dtype | bfloat16 |
33
+ | tokenizer | byte-level BPE, NFC normalization |
34
+ | vocab | 131,072 |
35
+
36
+ ## Tokenizer
37
+
38
+ The new tokenizer was trained to cover Korean, English, code, shell, terminal instructions, and JSON tool calls together.
39
+
40
+ | Sample | chars/token |
41
+ |---|---:|
42
+ | Korean (general) | 2.60 |
43
+ | Korean (legal) | 2.36 |
44
+ | Korean (terminal) | 2.18 |
45
+ | shell command | 2.68 |
46
+ | tool JSON | 3.32 |
47
+ | Python code | 3.37 |
48
+ | English | 4.40 |
49
+
50
+ Tokenizer repo: `LLM-OS-Models/HRM-Text-Ko-Terminal-Tokenizer-131K`
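A minimal sketch of how a chars/token figure like those in the table above can be measured. The helper `chars_per_token` and the stand-in `byte_encode` encoder are hypothetical; the card does not say which sample texts were used or whether characters were counted before or after NFC normalization, so normalizing first is an assumption here.

```python
import unicodedata
from typing import Callable, Sequence


def chars_per_token(text: str, encode: Callable[[str], Sequence[int]]) -> float:
    """Return characters per token for `text` under the given encoder.

    Assumption: characters are counted after NFC normalization, matching
    the normalization the card lists for the tokenizer.
    """
    normalized = unicodedata.normalize("NFC", text)
    n_tokens = len(encode(normalized))
    if n_tokens == 0:
        raise ValueError("encoder produced no tokens")
    return len(normalized) / n_tokens


def byte_encode(text: str) -> list[int]:
    """Stand-in encoder (one token per UTF-8 byte), NOT the real tokenizer."""
    return list(text.encode("utf-8"))


ascii_ratio = chars_per_token("ls -la", byte_encode)
hangul_ratio = chars_per_token("한국어", byte_encode)
```

With a raw byte encoder each Hangul syllable costs three tokens, which is the baseline a Korean-aware BPE vocabulary improves on. To measure the real tokenizer, one would pass its encode function instead of `byte_encode`, assuming the tokenizer repo loads with a standard tokenizer class.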
51
+
52
+ ## Training Data
53
+
54
+ The stage-0 input is a fully preprocessed 711.3M-token mix.
55
+
56
+ | Dataset | Tokens |
57
+ |---|---:|
58
+ | HRM cleaned base sample | 250.0M |
59
+ | SWE-ZERO + GLM reasoning mix | 251.2M |
60
+ | Korean law / ordinance / administrative rule / case-law tasks | 83.1M |
61
+ | ToolBench train tool-call task | 127.0M |
62
+ | Total | 711.3M |
63
+
64
+ Later stages sequentially add the retokenized original HRM cleaned dataset, a local terminal dataset, and additional Korean/coding/tool-call data. Evaluation-style datasets (`tb2_lite`, Terminal Bench 2, ToolBench eval, chi-bench) are excluded from training.
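As a quick consistency check, the per-dataset token counts in the table above can be summed and turned into mix proportions. The dictionary keys are shortened labels chosen for this sketch; the numbers are copied from the table.

```python
# Token counts in millions, copied from the stage-0 table above.
stage0_tokens_m = {
    "HRM cleaned base sample": 250.0,
    "SWE-ZERO + GLM reasoning mix": 251.2,
    "Korean legal tasks": 83.1,
    "ToolBench train tool-call task": 127.0,
}

# Round to one decimal to match the precision used in the table.
total_m = round(sum(stage0_tokens_m.values()), 1)

# Share of each dataset in the stage-0 mix.
shares = {name: tokens / total_m for name, tokens in stage0_tokens_m.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"total: {total_m}M tokens")
```

The rows add up to the stated 711.3M total, with the two largest sources contributing roughly equal shares.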
65
+
66
+ ## Training Method
67
+
68
+ - Objective: PrefixLM style response-only loss
69
+ - Optimizer: HRM-Text upstream Adam-atan2
70
+ - Context: 4096 tokens
71
+ - Hardware: 8 x NVIDIA H200
72
+ - Current stable global batch: 172,032 tokens
73
+ - Checkpoint policy: epoch-level raw FSDP2 checkpoint upload
74
+
75
+ The paper's default global batch was 196,608 tokens, but this model's larger 131,072-entry vocabulary makes the final-logits memory larger. For long runs, 172,032 tokens is used as the default to leave headroom against OOM.
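A rough back-of-the-envelope sketch of why the vocabulary size matters for the batch size. It assumes the global batch is split evenly across the 8 GPUs, that logits for every token position on a GPU are materialized at once in bfloat16, and that no micro-batching or chunked loss is used. None of these is confirmed by the card, so the numbers are only an upper-bound illustration.

```python
VOCAB_SIZE = 131_072     # from the model information table
NUM_GPUS = 8             # from the hardware line
BYTES_PER_BF16 = 2


def logits_gib_per_gpu(global_batch_tokens: int) -> float:
    """Estimated size of one full logits tensor per GPU, in GiB.

    Assumes even sharding and fully materialized bf16 logits (see lead-in).
    """
    tokens_per_gpu = global_batch_tokens / NUM_GPUS
    return tokens_per_gpu * VOCAB_SIZE * BYTES_PER_BF16 / 2**30


paper_default = logits_gib_per_gpu(196_608)
current_default = logits_gib_per_gpu(172_032)
print(f"196,608 tokens: {paper_default} GiB per GPU")
print(f"172,032 tokens: {current_default} GiB per GPU")
```

Under these assumptions the smaller batch saves 0.75 GiB per GPU for the forward logits alone; gradients and any float32 upcast in the loss would scale the saving further.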
76
+
77
+ In staged pretraining, the model/optimizer/EMA/carry state is inherited from the checkpoint, and `resume_step_offset` and `total_steps_override` align the LR schedule with the full pretraining run. In other words, training restarts whenever new data is ready, but the optimizer state and the schedule continue uninterrupted.
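The effect of a step offset on the schedule can be sketched as follows. The function `lr_at` and the warmup-then-cosine shape are assumptions made for illustration, since the upstream HRM-Text schedule is not shown here; the default values are taken from `all_config.yaml` in this commit.

```python
import math


def lr_at(
    local_step: int,
    *,
    base_lr: float = 2.2e-4,       # lr
    warmup_steps: int = 2000,      # lr_warmup_steps
    min_ratio: float = 1.0,        # lr_min_ratio
    resume_step_offset: int = 0,   # steps already done in earlier stages
    total_steps: int = 290_643,    # total_steps_override
) -> float:
    """Hypothetical schedule: linear warmup, then cosine decay to min_ratio.

    The schedule is evaluated at the global step (local step + offset), so
    a resumed stage continues where the previous stage stopped.
    """
    step = local_step + resume_step_offset
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = min(1.0, (step - warmup_steps) / max(1, total_steps - warmup_steps))
    cosine = 0.5 * (1.0 + math.cos(math.pi * progress))
    return base_lr * (min_ratio + (1.0 - min_ratio) * cosine)


fresh_start = lr_at(0)                           # restart without an offset
resumed = lr_at(0, resume_step_offset=4134)      # restart with the offset
```

Without the offset, a new stage would re-enter warmup from a zero learning rate; with it, the first step of the new stage already runs at the post-warmup rate. With `lr_min_ratio: 1.0` the rate stays flat after warmup under this sketch.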
78
+
79
+ ## Current Status
80
+
81
+ - stage-0 training: in progress
82
+ - HF upload: epoch checkpoint watcher active
83
+ - final Transformers conversion: not yet produced
84
+ - public benchmark score: not yet evaluated for this model
85
+
86
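+ ## Limitations
87
+
88
+ The current checkpoint artifacts are intermediate training outputs. They have not undergone safety alignment, final instruction tuning, final benchmarking, or conversion for deployment. Korean terminal/tool-call capability is the target domain, but stage-0 alone does not guarantee finished performance.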
all_config.yaml CHANGED
@@ -19,22 +19,24 @@ arch:
19
  beta1: 0.9
20
  beta2: 0.95
21
  checkpoint_interval: 1
22
- checkpoint_path: /home/work/.data/hrm_text_checkpoints/KoHRM-Text-1.4B-stage0-available-mix-gbs172
23
  data:
24
  path: /home/work/.data/hrm_text_prepared/koterm_pretrain_mix_v1
25
  target_only: true
26
  ema: 0.9999
27
  epochs: 1
28
  fwd_bwd_dtype: bfloat16
29
- global_batch_size: 172032
30
  log_interval: 5
31
  lr: 0.00022
32
  lr_min_ratio: 1.0
33
  lr_warmup_steps: 2000
34
  project_name: KoHRM-Text
35
  resume_epoch: null
36
- resume_from: null
37
- run_name: KoHRM-Text-1.4B-stage0-available-mix-gbs172
 
38
  seed: 0
 
39
  weight_decay: 0.1
40
  weights_only_resume_from_ema: false
 
19
  beta1: 0.9
20
  beta2: 0.95
21
  checkpoint_interval: 1
22
+ checkpoint_path: /home/work/.data/hrm_text_checkpoints/KoHRM-Text-1.4B-stage0b-debug-launch2
23
  data:
24
  path: /home/work/.data/hrm_text_prepared/koterm_pretrain_mix_v1
25
  target_only: true
26
  ema: 0.9999
27
  epochs: 1
28
  fwd_bwd_dtype: bfloat16
29
+ global_batch_size: 196608
30
  log_interval: 5
31
  lr: 0.00022
32
  lr_min_ratio: 1.0
33
  lr_warmup_steps: 2000
34
  project_name: KoHRM-Text
35
  resume_epoch: null
36
+ resume_from: /home/work/.data/hrm_text_checkpoints/KoHRM-Text-1.4B-stage0-available-mix-gbs172
37
+ resume_step_offset: 4134
38
+ run_name: KoHRM-Text-1.4B-stage0b-debug-launch2
39
  seed: 0
40
+ total_steps_override: 290643
41
  weight_decay: 0.1
42
  weights_only_resume_from_ema: false
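The new `resume_step_offset` and `total_steps_override` values can be related to token counts with simple arithmetic. The reading below, that both were derived at the earlier 172,032-token global batch, is an inference from the numbers and is not stated anywhere in the commit.

```python
STAGE0_GLOBAL_BATCH = 172_032     # global_batch_size before this commit
RESUME_STEP_OFFSET = 4_134        # resume_step_offset in the new config
TOTAL_STEPS_OVERRIDE = 290_643    # total_steps_override in the new config

# Tokens consumed by the steps that the offset skips over.
stage0_tokens = RESUME_STEP_OFFSET * STAGE0_GLOBAL_BATCH

# Tokens implied by the overridden schedule length at the same batch size.
planned_tokens = TOTAL_STEPS_OVERRIDE * STAGE0_GLOBAL_BATCH

print(f"offset covers  {stage0_tokens:,} tokens")
print(f"schedule spans {planned_tokens:,} tokens")
```

The offset corresponds to about 711.2M tokens, close to one pass over the 711.3M-token stage-0 mix described in the README, and the schedule length corresponds to just under 50B tokens. Because the resumed run in this commit sets `global_batch_size: 196608`, the tokens actually consumed per step after the resume would differ from this estimate.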
fsdp2_epoch_1/.metadata CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:fcc92a43939acd13b1b37b169bf80a36aa87bcd99a1d2cadf8a468fd088ecad3
3
- size 983801
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:550c05a8cece87340caee4521c6833b221b45130878490dd191914d4b77f4848
3
+ size 983795
fsdp2_epoch_1/__0_0.distcp CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:9ba42a7b016d3573633583034d72194f1d5624378f7785e08175f6155223050d
3
  size 2769065329
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3d031aa0a10c80c1726a0806206307ad1fadbd0413bd7253911f45af5190ab5c
3
  size 2769065329
fsdp2_epoch_1/__1_0.distcp CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:3937a022d2e122f06b7a63aefe120f76000e88048f64e3fe3684726d9c339cb5
3
  size 2769090643
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:125adba2372eef3bc99055707ad232a22dfe696252b9e77e4c815155266b71b7
3
  size 2769090643
fsdp2_epoch_1/__2_0.distcp CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:b4d4e8d16fbc87f1fe4bbca6b908cc8d4c4e72d97d416e086676464e44863787
3
  size 2769090643
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:897cab7b8e60c13a2bbfb056ac733f68ce35c4b700cfa7d9df9d5feb38eab485
3
  size 2769090643
fsdp2_epoch_1/__3_0.distcp CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:07d82f26dd0540c1160d0d2dd2b33b6ced75b7f818b3b56d7cbc3534ee6fdf0d
3
  size 2769090643
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d837c17876b1722a6c145808c5285727f8ca0147f504be6e26ca6ac5796fb06e
3
  size 2769090643
fsdp2_epoch_1/__4_0.distcp CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:c4f208931d4ec1f062967640b6c653c5020226a1edeb67861453b7372e6ea6b7
3
  size 2769090643
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bdbd7c0a982dead1a693467bc538cc464196e2eae7f8565823f72f578dca86c9
3
  size 2769090643
fsdp2_epoch_1/__5_0.distcp CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:5395d3e2f99b75298ecb8e97dfd6c2883ade881b0e8413d15c1de80a9d2e2158
3
  size 2769090643
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:e4660b1642281241e4125d83d1fcbb4b72dd2e4d91a0ecddb20696cf682bef69
3
  size 2769090643
fsdp2_epoch_1/__6_0.distcp CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:685ff0fb3629e778c3f2145349d43388883609737a39de5d760493d4ac59e8e9
3
  size 2769091588
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:a803d2cfbd7b8b7cd5c5550c5cf8df5487b1576ae4b6b9105942b1e3cef73695
3
  size 2769091588
fsdp2_epoch_1/__7_0.distcp CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:2136fb8e5d2fde1dd4ec0035aabc81224b3f3345bbda01ca208ec21f04094a7a
3
  size 2769098756
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:d4f60699364d470f1e139f0cc4b9108060443fcf6656f1cd9d39bd575a082316
3
  size 2769098756
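Each `.distcp` entry above is a Git LFS pointer: a `version` line, an `oid sha256:<digest>` line, and a `size` line. A minimal sketch of checking a downloaded shard against its pointer follows; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(pointer_text: str) -> dict:
    """Extract the hash algorithm, digest, and size from a Git LFS pointer."""
    fields = {}
    for line in pointer_text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {"algo": algo, "digest": digest, "size": int(fields["size"])}


def matches_pointer(path: Path, pointer: dict, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and hash match the pointer."""
    if path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as fh:
        # Stream in chunks so multi-gigabyte shards are never fully in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


# Pointer text for fsdp2_epoch_1/.metadata after this commit.
pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:550c05a8cece87340caee4521c6833b221b45130878490dd191914d4b77f4848\n"
    "size 983795\n"
)
```

Checking the size first avoids hashing a shard that is obviously truncated.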
upload_manifest.json CHANGED
@@ -1,7 +1,7 @@
1
  {
2
  "repo_id": "LLM-OS-Models/KoHRM-Text-1.4B",
3
- "checkpoint_root": "/home/work/.data/hrm_text_checkpoints/KoHRM-Text-1.4B-stage0-available-mix-gbs172",
4
  "epoch": 1,
5
- "staged_at": "2026-05-23T08:52:30Z",
6
- "stage_size_bytes": 25047929344
7
  }
 
1
  {
2
  "repo_id": "LLM-OS-Models/KoHRM-Text-1.4B",
3
+ "checkpoint_root": "/home/work/.data/hrm_text_checkpoints/KoHRM-Text-1.4B-stage0b-debug-launch2",
4
  "epoch": 1,
5
+ "staged_at": "2026-05-23T09:43:36Z",
6
+ "stage_size_bytes": 25047932140
7
  }