SeaWolf-AI committed on
Commit
f9be842
·
verified ·
1 Parent(s): bfa1acd

Update: Darwin-28B-KR v2 - VIDRAFT identity tuning + LoRA(r=16, embed/lm_head) applied

README.md CHANGED
@@ -1,8 +1,8 @@
1
  ---
2
  license: apache-2.0
3
  language:
4
- - en
5
  - ko
 
6
  base_model:
7
  - FINAL-Bench/Darwin-28B-Opus
8
  - FINAL-Bench/Darwin-27B-KR
@@ -14,66 +14,92 @@ tags:
14
  - multimodal
15
  - qwen3.5
16
  - evolutionary-merge
 
17
  library_name: transformers
18
  ---
19
 
20
  # Darwin-28B-KR
21
 
22
- > **Second-generation Korean-specialized model of the Darwin family**
23
- > Result of a Darwin V7 true merge (진머지) that combines 28B English reasoning with 27B Korean ability.
 
 
 
 
24
 
25
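- ## 🎯 Positioning
26
 
27
- Darwin-28B-KR is designed as the **mother model (母體) for developing second-generation Korean-specialized models** in the Darwin family.
28
 
29
- The model can be used as-is, and it also serves as the **common starting point** for future Korean domain-specialized models (legal, medical, finance, academic, and others).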
30
 
31
- ## 🧬 Lineage
32
 
33
  ```
34
  Qwen3.5-27B (Alibaba Qwen team)
35
-
36
-
37
  Darwin-27B-Opus (FINAL-Bench)
38
- Darwin V7 evolutionary merge
39
-
40
- ┌───┴────────────────────────┐
41
- ▼ ▼
42
- Darwin-28B-Opus Darwin-27B-KR
43
- (English/reasoning (Korean-specialized
44
- + multimodal) champion)
45
- │ │
46
- └────────┬───────────────────┘
47
- Darwin V7 MRI-aware merge
48
-
49
- Darwin-28B-KR ← this model (2nd-gen mother)
 
50
  ```
51
 
52
- ## ⚙️ Capabilities
 
 
53
 
54
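  | Capability | Source | Strength |
55
  |---|---|---|
56
  | Korean understanding/generation | Darwin-27B-KR lineage | ⭐⭐⭐⭐⭐ |
 
57
  | English reasoning | Darwin-28B-Opus lineage | ⭐⭐⭐⭐ |
58
  | Multimodal (image/video) | Preserved from Darwin-28B-Opus | ⭐⭐⭐⭐ |
59
- | Korean reasoning (CSAT/PSAT) | Merge effect | ⭐⭐⭐⭐⭐ |
60
  | English-Korean code-switching | Merge effect | ⭐⭐⭐⭐ |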
 
61
 
62
- ## 📊 Specs
 
 
63
 
64
- | | |
 
 
 
 
 
 
 
 
 
 
 
 
 
65
  |---|---|
66
  | Architecture | Qwen3_5ForConditionalGeneration (hybrid full + linear attention) |
67
  | Parameters | ~28B |
68
  | Hidden size | 5120 |
69
  | Layers | 64 |
70
  | Vocab size | 248,320 |
71
- | Format | bfloat16 (52 GB on disk) |
72
- | Context | 8K~32K (deployment dependent) |
 
 
73
 
74
- ## 🚀 Usage
75
 
76
- ### vLLM (recommended)
77
 
78
  ```bash
79
  vllm serve FINAL-Bench/Darwin-28B-KR \
@@ -84,7 +110,7 @@ vllm serve FINAL-Bench/Darwin-28B-KR \
84
  --gpu-memory-utilization 0.85
85
  ```
86
 
87
- ### OpenAI-compatible client
88
 
89
  ```python
90
  from openai import OpenAI
@@ -92,42 +118,76 @@ from openai import OpenAI
92
  client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
93
  response = client.chat.completions.create(
94
  model="FINAL-Bench/Darwin-28B-KR",
95
- messages=[{"role": "user", "content": "한국의 광복절은 무엇을 기념하는 날인가요?"}],
 
 
96
  max_tokens=2048,
97
  temperature=0.0,
98
  )
 
99
  ```
100
 
101
- ## 🖥️ Hardware
 
 
102
 
103
- | GPU family | Status |
104
  |---|---|
105
  | NVIDIA Blackwell (B200) | ✅ Best |
106
- | NVIDIA Hopper (H100/H200) | ✅ Recommended |
107
- | NVIDIA Ada (L40S) | ⚠️ Marginal (52 GB BF16) |
108
- | Older Ampere | ❌ Insufficient VRAM |
 
 
 
 
109
 
110
- **Minimum VRAM**: ~55 GB for inference at BF16.
111
 
112
- ## 🌳 Second-generation domain-specialized model development (planned)
 
 
 
 
 
 
 
 
 
113
 
114
  Korean-specialized variants planned to derive from this mother model:
115
 
116
- - **Darwin-28B-KR-Legal**: legal-domain SFT
117
- - **Darwin-28B-KR-Medical**: medical-domain SFT
118
- - **Darwin-28B-KR-Finance**: finance-domain SFT
119
  - **Darwin-28B-KR-Code**: code generation with Korean comments
120
  - **Darwin-28B-KR-MFP4**: memory-efficient quantized version
121
 
122
  Each variant uses this model as its base and is fine-tuned or merged with domain data.
123
 
 
 
 
 
 
 
 
 
 
 
 
 
 
124
  ## 🙏 Credits
125
 
126
- - Architecture lineage: Qwen3.5 (Alibaba Qwen team)
127
  - Father: [FINAL-Bench/Darwin-28B-Opus](https://huggingface.co/FINAL-Bench/Darwin-28B-Opus)
128
  - Mother: [FINAL-Bench/Darwin-27B-KR](https://huggingface.co/FINAL-Bench/Darwin-27B-KR)
129
- - Merge methodology: Darwin V7 MRI-aware evolutionary merge
 
 
 
130
 
131
  ## 📜 License
132
 
133
- Apache 2.0 (inherited from base models).
 
1
  ---
2
  license: apache-2.0
3
  language:
 
4
  - ko
5
+ - en
6
  base_model:
7
  - FINAL-Bench/Darwin-28B-Opus
8
  - FINAL-Bench/Darwin-27B-KR
 
14
  - multimodal
15
  - qwen3.5
16
  - evolutionary-merge
17
+ - vidraft
18
  library_name: transformers
19
  ---
20
 
21
  # Darwin-28B-KR
22
 
23
+ > **VIDRAFT's Korean-specialized 28B multilingual model**
24
+ > Second-generation Korean mother model of the Darwin family
25
+
26
+ ---
27
+
28
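+ ## 🎯 Model overview
29
 
30
+ **Darwin-28B-KR** is a Korean-specialized, 28B-parameter multilingual model developed by VIDRAFT.
31
 
32
+ It is the second-generation mother model (母體) of the Darwin family, designed to combine English reasoning with Korean ability. It supports Korean expression, understanding, and reasoning, English reasoning, and multimodal (image and video) understanding. It will also serve as the common starting point for future Korean domain-specialized models (legal, medical, finance, academic, and others).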
33
 
34
+ ---
35
 
36
+ ## 🧬 Lineage
37
 
38
  ```
39
  Qwen3.5-27B (Alibaba Qwen team)
40
+ |
41
+ v
42
  Darwin-27B-Opus (FINAL-Bench)
43
+ | Darwin V7 evolutionary merge
44
+ |
45
+ +---+----------------------+
46
+ v v
47
+ Darwin-28B-Opus Darwin-27B-KR
48
+ (English/reasoning (Korean-specialized champion
49
+ + multimodal) CLIcK 79.59%)
50
+ | |
51
+ +--------+-----------------+
52
+ | Darwin V7 MRI-aware merge
53
+ | (Korean output pathway preserved 100% from Mother)
54
+ v
55
+ Darwin-28B-KR <- this model
56
  ```
57
 
58
+ ---
59
+
60
+ ## ⚙️ Capability matrix
61
 
62
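  | Capability | Source | Strength |
63
  |---|---|---|
64
  | Korean understanding/generation | Darwin-27B-KR lineage | ⭐⭐⭐⭐⭐ |
65
+ | Korean reasoning (CSAT/PSAT) | Merge effect | ⭐⭐⭐⭐⭐ |
66
  | English reasoning | Darwin-28B-Opus lineage | ⭐⭐⭐⭐ |
67
  | Multimodal (image/video) | Preserved from Darwin-28B-Opus | ⭐⭐⭐⭐ |
 
68
  | English-Korean code-switching | Merge effect | ⭐⭐⭐⭐ |
69
+ | Self-identity awareness | VIDRAFT training | ⭐⭐⭐⭐⭐ |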
70
 
71
+ ---
72
+
73
+ ## 📊 K-AI leaderboard CLIcK comparison
74
 
75
+ | Model | CLIcK |
76
+ |---|---|
77
+ | QuettaLLMs-27B-Koreasoner-V3 | 0.794 |
78
+ | Rogue-27B-KR | 0.791 |
79
+ | **Darwin-28B-KR (this model)** | **0.786** |
80
+ | AWAXIS-Think-28B | 0.770 |
81
+
82
+ (* Based on a 200-question evaluation)
83
+
84
+ ---
85
+
86
+ ## 📊 Specs
87
+
88
+ | Item | Value |
89
  |---|---|
90
  | Architecture | Qwen3_5ForConditionalGeneration (hybrid full + linear attention) |
91
  | Parameters | ~28B |
92
  | Hidden size | 5120 |
93
  | Layers | 64 |
94
  | Vocab size | 248,320 |
95
+ | Format | bfloat16 (~53 GB on disk) |
96
+ | Context | 8K~32K (deployment dependent) |
97
+
98
+ ---
99
 
100
+ ## 🚀 Usage
101
 
102
+ ### vLLM (recommended)
103
 
104
  ```bash
105
  vllm serve FINAL-Bench/Darwin-28B-KR \
 
110
  --gpu-memory-utilization 0.85
111
  ```
112
 
113
+ ### OpenAI-compatible client
114
 
115
  ```python
116
  from openai import OpenAI
 
118
  client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
119
  response = client.chat.completions.create(
120
  model="FINAL-Bench/Darwin-28B-KR",
121
+ messages=[
122
+ {"role": "user", "content": "한국의 광복절은 무엇을 기념하는 날인가요?"}
123
+ ],
124
  max_tokens=2048,
125
  temperature=0.0,
126
  )
127
+ print(response.choices[0].message.content)
128
  ```
129
 
130
+ ---
131
+
132
+ ## 🖥️ Hardware requirements
133
 
134
+ | GPU family | Status |
135
  |---|---|
136
  | NVIDIA Blackwell (B200) | ✅ Best |
137
+ | NVIDIA Hopper (H100/H200) | ✅ Recommended |
138
+ | NVIDIA Ada (L40S) | ⚠️ Marginal (53 GB BF16) |
139
+ | Older Ampere | ❌ Insufficient VRAM |
140
+
141
+ **Minimum VRAM**: ~55 GB (for BF16 inference)
142
+
143
+ ---
144
 
145
+ ## 💬 Self-introduction example
146
 
147
+ ```
148
+ User: 당신은 누구인가요?
149
+ Darwin-28B-KR: 저는 비드래프트가 개발한 Darwin-28B-KR입니다.
150
+ 한국어에 특화된 280억 파라미터 규모의 언어 모델로,
151
+ 다양한 한국어 작업에 최적화되어 있습니다.
152
+ ```
153
+
154
+ ---
155
+
156
+ ## 🌳 Second-generation domain-specialized models (planned)
157
 
158
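  Korean-specialized variants planned to derive from this mother model:
159
 
160
+ - **Darwin-28B-KR-Legal**: legal domain
161
+ - **Darwin-28B-KR-Medical**: medical domain
162
+ - **Darwin-28B-KR-Finance**: finance domain
163
  - **Darwin-28B-KR-Code**: code generation with Korean comments
164
  - **Darwin-28B-KR-MFP4**: memory-efficient quantized version
165
 
166
  Each variant uses this model as its base and is fine-tuned or merged with domain data.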
167
 
168
+ ---
169
+
170
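+ ## 🌳 Use cases
171
+
172
+ - **General Korean conversation / Q&A**
173
+ - **Answers on Korean culture, history, and law**
174
+ - **Korean reasoning (CSAT/PSAT/K-AI evaluations)**
175
+ - **English reasoning / English-Korean translation**
176
+ - **Image/video analysis with Korean descriptions**
177
+ - **Korean writing / summarization / creative writing**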
178
+
179
+ ---
180
+
181
  ## 🙏 Credits
182
 
183
+ - Architecture: Qwen3.5 (Alibaba Qwen team)
184
  - Father: [FINAL-Bench/Darwin-28B-Opus](https://huggingface.co/FINAL-Bench/Darwin-28B-Opus)
185
  - Mother: [FINAL-Bench/Darwin-27B-KR](https://huggingface.co/FINAL-Bench/Darwin-27B-KR)
186
+ - Methodology: Darwin V7 MRI-aware evolutionary merge
187
+ - Identity tuning: VIDRAFT
188
+
189
+ ---
190
 
191
  ## 📜 License
192
 
193
+ Apache 2.0 (inherited from base models)
chat_template.jinja CHANGED
@@ -1,3 +1,6 @@
 
 
 
1
  {%- if tools %}
2
  {{- '<|im_start|>system\n' }}
3
  {%- if messages[0].role == 'system' %}
 
1
+ {%- if not messages or messages[0].role != "system" %}
2
+ {{- "<|im_start|>system\n당신은 비드래프트가 개발한 Darwin-28B-KR입니다.<|im_end|>\n" -}}
3
+ {%- endif %}
4
  {%- if tools %}
5
  {{- '<|im_start|>system\n' }}
6
  {%- if messages[0].role == 'system' %}
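
Reviewer sketch (not part of the commit): the three lines added at the top of `chat_template.jinja` appear to prepend a default identity system block whenever the conversation is empty or does not start with a system message. The plain-Python mirror below illustrates only that guard; the function name `identity_prefix` is hypothetical, and the rest of the template is not modeled.

```python
# Illustrative only: a plain-Python mirror of the guard added at the top of
# chat_template.jinja. `identity_prefix` is a hypothetical helper name.
DEFAULT_SYSTEM = "당신은 비드래프트가 개발한 Darwin-28B-KR입니다."


def identity_prefix(messages):
    """Return the default system block the new guard would emit, or ''."""
    if not messages or messages[0]["role"] != "system":
        return "<|im_start|>system\n" + DEFAULT_SYSTEM + "<|im_end|>\n"
    return ""


if __name__ == "__main__":
    # No system message: the identity block is prepended.
    print(repr(identity_prefix([{"role": "user", "content": "안녕하세요"}])))
    # Caller supplies a system message: nothing is prepended.
    print(repr(identity_prefix([{"role": "system", "content": "custom"}])))
```

Judging only from the lines visible in this hunk, a request that passes `tools` without a system message would seem to get both the identity block and the separate tools system block; that is an inference worth checking against the full template.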
generation_config.json CHANGED
@@ -1,12 +1,13 @@
1
  {
2
- "bos_token_id": 248044,
3
- "do_sample": true,
4
- "eos_token_id": [
5
- 248046,
6
- 248044
7
- ],
8
- "pad_token_id": 248044,
9
- "temperature": 1.0,
10
- "top_k": 20,
11
- "top_p": 0.95
 
12
  }
 
1
  {
2
+ "bos_token_id": 248044,
3
+ "do_sample": true,
4
+ "eos_token_id": [
5
+ 248046,
6
+ 248044
7
+ ],
8
+ "pad_token_id": 248044,
9
+ "temperature": 1.0,
10
+ "top_k": 20,
11
+ "top_p": 0.95,
12
+ "transformers_version": "5.5.4"
13
  }
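
Reviewer sketch (not part of the commit): as far as the rendered diff shows, the only new key in `generation_config.json` is `transformers_version`; the token ids and sampling defaults are unchanged. The snippet below is a small sanity check, under that reading, that maps the configured stop ids to the token strings listed in the old `added_tokens_decoder` block. The helper name `describe_stop_tokens` is hypothetical.

```python
import json

# Values copied from the new generation_config.json shown in this diff.
GENERATION_CONFIG = json.loads("""
{
  "bos_token_id": 248044,
  "do_sample": true,
  "eos_token_id": [248046, 248044],
  "pad_token_id": 248044,
  "temperature": 1.0,
  "top_k": 20,
  "top_p": 0.95,
  "transformers_version": "5.5.4"
}
""")

# Token strings copied from the added_tokens_decoder entries in the diff.
TOKEN_NAMES = {248044: "<|endoftext|>", 248046: "<|im_end|>"}


def describe_stop_tokens(config):
    """Return the token strings for every configured EOS id (hypothetical helper)."""
    eos = config["eos_token_id"]
    ids = eos if isinstance(eos, list) else [eos]
    return [TOKEN_NAMES.get(i, "<unknown>") for i in ids]


if __name__ == "__main__":
    print("stop tokens:", describe_stop_tokens(GENERATION_CONFIG))
    # The pad id is reused from one of the stop ids in this config.
    print("pad id is a stop id:",
          GENERATION_CONFIG["pad_token_id"] in GENERATION_CONFIG["eos_token_id"])
```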
model-00001-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:3c136320ff6d55f94bc8fc7f26cbc52ad16d476af0b077579854adafb13bf8bd
3
+ size 2542796928
model-00002-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:ae6edad4df2b3e67bfcb4221e767cf08fe2d05048e69234efcbc24d3d63c162f
3
+ size 4842451920
model-00003-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:88bda2c811178ff22ecd2217b7db58e800e0533f03be1f75d7b5e4d0ca9a2e6c
3
+ size 4965227944
model-00004-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9d64e0db67c9e9db105fa162bffa5d181c14675572ab8f80aac2d39b43ff156f
3
+ size 4912819264
model-00005-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:393f4027a4376cd059a252daef9f1fa39fae56fcb5021ded42e0971390c198fa
3
+ size 4986198544
model-00006-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9b314625d7a21ebb93cb6aaa115c69182dd3458f3384733c83427c1212963107
3
+ size 4912819320
model-00007-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:bbf867baa1ed213d412360486f9a81bbc6900237a6661527f06ca6e85256e8a3
3
+ size 4932703272
model-00008-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c0ee60442e4a72bc2ccfd1da158a638133eeaef2591a8f8b20fe1dc30dc4c25d
3
+ size 4966314576
model-00009-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:0e2eac4ab9198053b17c0fe0f2b0fdc4b98c4ae91954e169419d50df5f555bf5
3
+ size 4964162248
model-00010-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:fcc1094d280d4eb757ce0096123e8b915941c350bb192054062cdf22d82521ff
3
+ size 4933789824
model-00011-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:5d6352016e6b52f7793d56e7c78ff82033506126d551f77d00cc3be8bd604f25
3
+ size 4965228032
model-00012-of-00012.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:07e9ca5a9eda668ba5a9cc4e48a18349aaf42aa1b332edbb9638caa0ce0a399a
3
+ size 1867596944
model-visual-extra.safetensors ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2b9b2f8c7868a88a91c2b5cefda596f7889c68259eebc62b2d7732937ea7ae5f
3
+ size 921497200
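
Reviewer sketch (not part of the commit): the LFS pointer sizes above can be summed to cross-check the README's "~53 GB on disk" figure. The sizes are copied from this diff; the helper name `to_gb` is illustrative. By this arithmetic the twelve shards come to roughly 53.8 GB (decimal), and roughly 54.7 GB once `model-visual-extra.safetensors` is included.

```python
# Sizes in bytes, copied from the Git LFS pointer files added in this commit.
SHARD_SIZES = [
    2542796928,  # model-00001-of-00012
    4842451920,  # model-00002-of-00012
    4965227944,  # model-00003-of-00012
    4912819264,  # model-00004-of-00012
    4986198544,  # model-00005-of-00012
    4912819320,  # model-00006-of-00012
    4932703272,  # model-00007-of-00012
    4966314576,  # model-00008-of-00012
    4964162248,  # model-00009-of-00012
    4933789824,  # model-00010-of-00012
    4965228032,  # model-00011-of-00012
    1867596944,  # model-00012-of-00012
]
VISUAL_EXTRA_SIZE = 921497200  # model-visual-extra.safetensors


def to_gb(num_bytes):
    """Convert bytes to decimal gigabytes (1 GB = 10**9 bytes)."""
    return num_bytes / 1e9


if __name__ == "__main__":
    shards = sum(SHARD_SIZES)
    total = shards + VISUAL_EXTRA_SIZE
    print(f"12 shards:               {to_gb(shards):.2f} GB")
    print(f"with model-visual-extra: {to_gb(total):.2f} GB")
```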
model.safetensors.index.json CHANGED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json CHANGED
@@ -9,7 +9,7 @@
9
  "eos_token": "<|im_end|>",
10
  "errors": "replace",
11
  "image_token": "<|image_pad|>",
12
- "is_local": false,
13
  "model_max_length": 262144,
14
  "model_specific_special_tokens": {
15
  "audio_bos_token": "<|audio_start|>",
@@ -29,272 +29,5 @@
29
  "unk_token": null,
30
  "video_token": "<|video_pad|>",
31
  "vision_bos_token": "<|vision_start|>",
32
- "vision_eos_token": "<|vision_end|>",
33
- "added_tokens_decoder": {
34
- "248044": {
35
- "content": "<|endoftext|>",
36
- "single_word": false,
37
- "lstrip": false,
38
- "rstrip": false,
39
- "normalized": false,
40
- "special": true
41
- },
42
- "248045": {
43
- "content": "<|im_start|>",
44
- "single_word": false,
45
- "lstrip": false,
46
- "rstrip": false,
47
- "normalized": false,
48
- "special": true
49
- },
50
- "248046": {
51
- "content": "<|im_end|>",
52
- "single_word": false,
53
- "lstrip": false,
54
- "rstrip": false,
55
- "normalized": false,
56
- "special": true
57
- },
58
- "248047": {
59
- "content": "<|object_ref_start|>",
60
- "single_word": false,
61
- "lstrip": false,
62
- "rstrip": false,
63
- "normalized": false,
64
- "special": true
65
- },
66
- "248048": {
67
- "content": "<|object_ref_end|>",
68
- "single_word": false,
69
- "lstrip": false,
70
- "rstrip": false,
71
- "normalized": false,
72
- "special": true
73
- },
74
- "248049": {
75
- "content": "<|box_start|>",
76
- "single_word": false,
77
- "lstrip": false,
78
- "rstrip": false,
79
- "normalized": false,
80
- "special": true
81
- },
82
- "248050": {
83
- "content": "<|box_end|>",
84
- "single_word": false,
85
- "lstrip": false,
86
- "rstrip": false,
87
- "normalized": false,
88
- "special": true
89
- },
90
- "248051": {
91
- "content": "<|quad_start|>",
92
- "single_word": false,
93
- "lstrip": false,
94
- "rstrip": false,
95
- "normalized": false,
96
- "special": true
97
- },
98
- "248052": {
99
- "content": "<|quad_end|>",
100
- "single_word": false,
101
- "lstrip": false,
102
- "rstrip": false,
103
- "normalized": false,
104
- "special": true
105
- },
106
- "248053": {
107
- "content": "<|vision_start|>",
108
- "single_word": false,
109
- "lstrip": false,
110
- "rstrip": false,
111
- "normalized": false,
112
- "special": true
113
- },
114
- "248054": {
115
- "content": "<|vision_end|>",
116
- "single_word": false,
117
- "lstrip": false,
118
- "rstrip": false,
119
- "normalized": false,
120
- "special": true
121
- },
122
- "248055": {
123
- "content": "<|vision_pad|>",
124
- "single_word": false,
125
- "lstrip": false,
126
- "rstrip": false,
127
- "normalized": false,
128
- "special": true
129
- },
130
- "248056": {
131
- "content": "<|image_pad|>",
132
- "single_word": false,
133
- "lstrip": false,
134
- "rstrip": false,
135
- "normalized": false,
136
- "special": true
137
- },
138
- "248057": {
139
- "content": "<|video_pad|>",
140
- "single_word": false,
141
- "lstrip": false,
142
- "rstrip": false,
143
- "normalized": false,
144
- "special": true
145
- },
146
- "248058": {
147
- "content": "<tool_call>",
148
- "single_word": false,
149
- "lstrip": false,
150
- "rstrip": false,
151
- "normalized": false,
152
- "special": false
153
- },
154
- "248059": {
155
- "content": "</tool_call>",
156
- "single_word": false,
157
- "lstrip": false,
158
- "rstrip": false,
159
- "normalized": false,
160
- "special": false
161
- },
162
- "248060": {
163
- "content": "<|fim_prefix|>",
164
- "single_word": false,
165
- "lstrip": false,
166
- "rstrip": false,
167
- "normalized": false,
168
- "special": false
169
- },
170
- "248061": {
171
- "content": "<|fim_middle|>",
172
- "single_word": false,
173
- "lstrip": false,
174
- "rstrip": false,
175
- "normalized": false,
176
- "special": false
177
- },
178
- "248062": {
179
- "content": "<|fim_suffix|>",
180
- "single_word": false,
181
- "lstrip": false,
182
- "rstrip": false,
183
- "normalized": false,
184
- "special": false
185
- },
186
- "248063": {
187
- "content": "<|fim_pad|>",
188
- "single_word": false,
189
- "lstrip": false,
190
- "rstrip": false,
191
- "normalized": false,
192
- "special": false
193
- },
194
- "248064": {
195
- "content": "<|repo_name|>",
196
- "single_word": false,
197
- "lstrip": false,
198
- "rstrip": false,
199
- "normalized": false,
200
- "special": false
201
- },
202
- "248065": {
203
- "content": "<|file_sep|>",
204
- "single_word": false,
205
- "lstrip": false,
206
- "rstrip": false,
207
- "normalized": false,
208
- "special": false
209
- },
210
- "248066": {
211
- "content": "<tool_response>",
212
- "single_word": false,
213
- "lstrip": false,
214
- "rstrip": false,
215
- "normalized": false,
216
- "special": false
217
- },
218
- "248067": {
219
- "content": "</tool_response>",
220
- "single_word": false,
221
- "lstrip": false,
222
- "rstrip": false,
223
- "normalized": false,
224
- "special": false
225
- },
226
- "248068": {
227
- "content": "<think>",
228
- "single_word": false,
229
- "lstrip": false,
230
- "rstrip": false,
231
- "normalized": false,
232
- "special": false
233
- },
234
- "248069": {
235
- "content": "</think>",
236
- "single_word": false,
237
- "lstrip": false,
238
- "rstrip": false,
239
- "normalized": false,
240
- "special": false
241
- },
242
- "248070": {
243
- "content": "<|audio_start|>",
244
- "single_word": false,
245
- "lstrip": false,
246
- "rstrip": false,
247
- "normalized": false,
248
- "special": true
249
- },
250
- "248071": {
251
- "content": "<|audio_end|>",
252
- "single_word": false,
253
- "lstrip": false,
254
- "rstrip": false,
255
- "normalized": false,
256
- "special": true
257
- },
258
- "248072": {
259
- "content": "<tts_pad>",
260
- "single_word": false,
261
- "lstrip": false,
262
- "rstrip": false,
263
- "normalized": false,
264
- "special": true
265
- },
266
- "248073": {
267
- "content": "<tts_text_bos>",
268
- "single_word": false,
269
- "lstrip": false,
270
- "rstrip": false,
271
- "normalized": false,
272
- "special": true
273
- },
274
- "248074": {
275
- "content": "<tts_text_eod>",
276
- "single_word": false,
277
- "lstrip": false,
278
- "rstrip": false,
279
- "normalized": false,
280
- "special": true
281
- },
282
- "248075": {
283
- "content": "<tts_text_bos_single>",
284
- "single_word": false,
285
- "lstrip": false,
286
- "rstrip": false,
287
- "normalized": false,
288
- "special": true
289
- },
290
- "248076": {
291
- "content": "<|audio_pad|>",
292
- "single_word": false,
293
- "lstrip": false,
294
- "rstrip": false,
295
- "normalized": false,
296
- "special": true
297
- }
298
- },
299
- "chat_template": "{%- if tools %}\n {{- '<|im_start|>system\\n' }}\n {%- if messages[0].role == 'system' %}\n {{- messages[0].content + '\\n\\n' }}\n {%- endif %}\n {{- \"# Tools\\n\\nYou may call one or more functions to assist with the user query.\\n\\nYou are provided with function signatures within <tools></tools> XML tags:\\n<tools>\" }}\n {%- for tool in tools %}\n {{- \"\\n\" }}\n {{- tool | tojson }}\n {%- endfor %}\n {{- \"\\n</tools>\\n\\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\\n<tool_call>\\n{\\\"name\\\": <function-name>, \\\"arguments\\\": <args-json-object>}\\n</tool_call><|im_end|>\\n\" }}\n{%- else %}\n {%- if messages[0].role == 'system' %}\n {{- '<|im_start|>system\\n' + messages[0].content + '<|im_end|>\\n' }}\n {%- endif %}\n{%- endif %}\n{%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}\n{%- for message in messages[::-1] %}\n {%- set index = (messages|length - 1) - loop.index0 %}\n {%- if ns.multi_step_tool and message.role == \"user\" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}\n {%- set ns.multi_step_tool = false %}\n {%- set ns.last_query_index = index %}\n {%- endif %}\n{%- endfor %}\n{%- for message in messages %}\n {%- if message.content is string %}\n {%- set content = message.content %}\n {%- else %}\n {%- set content = '' %}\n {%- endif %}\n {%- if (message.role == \"user\") or (message.role == \"system\" and not loop.first) %}\n {{- '<|im_start|>' + message.role + '\\n' + content + '<|im_end|>' + '\\n' }}\n {%- elif message.role == \"assistant\" %}\n {%- set reasoning_content = '' %}\n {%- if message.reasoning_content is string %}\n {%- set reasoning_content = message.reasoning_content %}\n {%- else %}\n {%- if '</think>' in content %}\n {%- set reasoning_content = 
content.split('</think>')[0].rstrip('\\n').split('<think>')[-1].lstrip('\\n') %}\n {%- set content = content.split('</think>')[-1].lstrip('\\n') %}\n {%- endif %}\n {%- endif %}\n {%- if loop.index0 > ns.last_query_index %}\n {%- if loop.last or (not loop.last and reasoning_content) %}\n {{- '<|im_start|>' + message.role + '\\n<think>\\n' + reasoning_content.strip('\\n') + '\\n</think>\\n\\n' + content.lstrip('\\n') }}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- else %}\n {{- '<|im_start|>' + message.role + '\\n' + content }}\n {%- endif %}\n {%- if message.tool_calls %}\n {%- for tool_call in message.tool_calls %}\n {%- if (loop.first and content) or (not loop.first) %}\n {{- '\\n' }}\n {%- endif %}\n {%- if tool_call.function %}\n {%- set tool_call = tool_call.function %}\n {%- endif %}\n {{- '<tool_call>\\n{\"name\": \"' }}\n {{- tool_call.name }}\n {{- '\", \"arguments\": ' }}\n {%- if tool_call.arguments is string %}\n {{- tool_call.arguments }}\n {%- else %}\n {{- tool_call.arguments | tojson }}\n {%- endif %}\n {{- '}\\n</tool_call>' }}\n {%- endfor %}\n {%- endif %}\n {{- '<|im_end|>\\n' }}\n {%- elif message.role == \"tool\" %}\n {%- if loop.first or (messages[loop.index0 - 1].role != \"tool\") %}\n {{- '<|im_start|>user' }}\n {%- endif %}\n {{- '\\n<tool_response>\\n' }}\n {{- content }}\n {{- '\\n</tool_response>' }}\n {%- if loop.last or (messages[loop.index0 + 1].role != \"tool\") %}\n {{- '<|im_end|>\\n' }}\n {%- endif %}\n {%- endif %}\n{%- endfor %}\n{%- if add_generation_prompt %}\n {{- '<|im_start|>assistant\n<think>\n' }}\n{%- endif %}"
300
- }
 
9
  "eos_token": "<|im_end|>",
10
  "errors": "replace",
11
  "image_token": "<|image_pad|>",
12
+ "is_local": true,
13
  "model_max_length": 262144,
14
  "model_specific_special_tokens": {
15
  "audio_bos_token": "<|audio_start|>",
 
29
  "unk_token": null,
30
  "video_token": "<|video_pad|>",
31
  "vision_bos_token": "<|vision_start|>",
32
+ "vision_eos_token": "<|vision_end|>"
33
+ }