Commit 9e6f5b7 (verified) by Osaurus-AI · 1 parent: a59493f

DSV4-Flash JANGTQ2 (uniform 2-bit, --variant std, MTP preserved-disabled)

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. .gitattributes +2 -0
  2. DeepSeek_V4.pdf +3 -0
  3. LICENSE +21 -0
  4. README.md +117 -0
  5. config.json +145 -0
  6. encoding/README.md +156 -0
  7. encoding/__pycache__/encoding_dsv4.cpython-312.pyc +0 -0
  8. encoding/__pycache__/encoding_dsv4.cpython-313.pyc +0 -0
  9. encoding/encoding_dsv4.py +744 -0
  10. encoding/test_encoding_dsv4.py +89 -0
  11. encoding/tests/test_input_1.json +81 -0
  12. encoding/tests/test_input_2.json +24 -0
  13. encoding/tests/test_input_3.json +159 -0
  14. encoding/tests/test_input_4.json +28 -0
  15. encoding/tests/test_output_1.txt +36 -0
  16. encoding/tests/test_output_2.txt +1 -0
  17. encoding/tests/test_output_3.txt +38 -0
  18. encoding/tests/test_output_4.txt +29 -0
  19. generation_config.json +12 -0
  20. jang_config.json +101 -0
  21. jangtq_runtime.safetensors +3 -0
  22. model-00001-of-00075.safetensors +3 -0
  23. model-00002-of-00075.safetensors +3 -0
  24. model-00003-of-00075.safetensors +3 -0
  25. model-00004-of-00075.safetensors +3 -0
  26. model-00005-of-00075.safetensors +3 -0
  27. model-00006-of-00075.safetensors +3 -0
  28. model-00007-of-00075.safetensors +3 -0
  29. model-00008-of-00075.safetensors +3 -0
  30. model-00009-of-00075.safetensors +3 -0
  31. model-00010-of-00075.safetensors +3 -0
  32. model-00011-of-00075.safetensors +3 -0
  33. model-00012-of-00075.safetensors +3 -0
  34. model-00013-of-00075.safetensors +3 -0
  35. model-00014-of-00075.safetensors +3 -0
  36. model-00015-of-00075.safetensors +3 -0
  37. model-00016-of-00075.safetensors +3 -0
  38. model-00017-of-00075.safetensors +3 -0
  39. model-00018-of-00075.safetensors +3 -0
  40. model-00019-of-00075.safetensors +3 -0
  41. model-00020-of-00075.safetensors +3 -0
  42. model-00021-of-00075.safetensors +3 -0
  43. model-00022-of-00075.safetensors +3 -0
  44. model-00023-of-00075.safetensors +3 -0
  45. model-00024-of-00075.safetensors +3 -0
  46. model-00025-of-00075.safetensors +3 -0
  47. model-00026-of-00075.safetensors +3 -0
  48. model-00027-of-00075.safetensors +3 -0
  49. model-00028-of-00075.safetensors +3 -0
  50. model-00029-of-00075.safetensors +3 -0
.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ *.pdf filter=lfs diff=lfs merge=lfs -text
+ *.png filter=lfs diff=lfs merge=lfs -text
DeepSeek_V4.pdf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f4cbe4fcbd2888b25b2890a98cc6ef4ce0489df7c93e140b6f853c451d3f5c52
+ size 4479907
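The three added lines are a Git LFS pointer rather than the PDF itself: a spec version, a content hash, and a byte size. A minimal sketch of reading such a pointer and checking a downloaded file against it follows; the helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of this commit.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form '<algorithm>:<hex digest>'.
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if data has the size and hash recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# Pointer text copied from the DeepSeek_V4.pdf entry in this diff.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:f4cbe4fcbd2888b25b2890a98cc6ef4ce0489df7c93e140b6f853c451d3f5c52
size 4479907
"""

pointer = parse_lfs_pointer(POINTER_TEXT)
print(pointer["algo"], pointer["size"])
```

To verify a real download, read the file in binary mode and pass its bytes to `matches_pointer`; for multi-gigabyte shards, hashing in chunks would be the better choice.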
LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2023 DeepSeek
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
README.md ADDED
@@ -0,0 +1,117 @@
+ ---
+ license: mit
+ license_name: deepseek-license
+ library_name: mlx
+ base_model: deepseek-ai/DeepSeek-V4-Flash
+ base_model_relation: quantized
+ pipeline_tag: text-generation
+ tags:
+ - mlx
+ - jang
+ - jangtq
+ - jangtq2
+ - jangtq-prestack
+ - mxtq
+ - deepseek
+ - deepseek-v4
+ - deepseek-v4-flash
+ - moe
+ - mla
+ - hash-layers
+ - mtp
+ - apple-silicon
+ - osaurus
+ ---
+
+ <p align="center"><img src="osaurus-x-banner.png" width="100%" alt="OsaurusAI"/></p>
+
+ # DeepSeek-V4-Flash-JANGTQ2
+
+ **DeepSeek-V4-Flash, 79.6 GB on disk** (down from 149 GB FP4+FP8 source):
+ uniform **2-bit JANGTQ** quantization on routed experts + 8-bit affine on
+ everything else + preserved MTP head.
+
+ - **Source:** [deepseek-ai/DeepSeek-V4-Flash](https://huggingface.co/deepseek-ai/DeepSeek-V4-Flash)
+   (43 transformer layers + 1 MTP head, **256 routed experts top-6 + 1
+   shared expert**, **3 hash layers**, MLA + mHC residuals, ~284 B total)
+ - **Quantization:** uniform **2-bit MXTQ** on routed-expert MLP +
+   8-bit affine on attention (`wq_a/wq_b/wkv/wo_a/wo_b`) / shared
+   expert / Compressor / Indexer / embed / lm_head / MTP. RMSNorms,
+   router gate, mHC fn matrices, attn_sink, ape stay fp16/fp32
+   passthrough.
+ - **Variant:** `std` (preserves MTP layer 43; one-token-per-forward
+   until a JANG runtime ships the accept/reject speculative-decode loop).
+   The companion `DeepSeek-V4-Flash-JANGTQ-K` variant drops MTP for a
+   smaller bundle.
+ - **Routed-expert layout:** **pre-stacked** along axis 0 under
+   `ffn.experts.switch_mlp.{gate_proj, up_proj, down_proj}` per the
+   JANGTQ-PRESTACK STANDARD. Sidecar `jangtq_runtime.safetensors`
+   (~24 KB) ships both `(in=2048, bits=2)` and `(in=4096, bits=2)`
+   codebooks + sign-flip vectors for Swift runtimes.
+ - **Bundle size:** **~79.6 GB on-disk**
+ - **Runs on:** M4 Max 128 GB / M5 Max 128 GB / Mac Studio 192 GB+
+
+ ## Why top-6 + 2-bit holds
+
+ DSV4-Flash routes through **6 of 256 experts per token** plus 1 always-on
+ shared expert and 3 hash layers, so per-token output averages
+ codebook noise across 7+ pathways. That's a much weaker quality
+ constraint than top-1 architectures (where every token rides a single
+ expert's quant error). MiniMax (top-2) and Hy3-preview (top-8) both
+ ship coherent uniform JANGTQ2; DSV4 sits between them.
+
+ ## Loading (Python)
+
+ ```bash
+ pip install jang-tools mlx-lm
+ ```
+
+ ```python
+ from jang_tools.load_jangtq import load_jangtq_model
+
+ model, tokenizer = load_jangtq_model("OsaurusAI/DeepSeek-V4-Flash-JANGTQ2")
+
+ chat = tokenizer.apply_chat_template(
+     [{"role": "user", "content": "What is 2 + 2? Answer briefly."}],
+     tokenize=False,
+     add_generation_prompt=True,
+ )
+ ```
+
+ `load_jangtq_model` auto-registers `model_type=deepseek_v4` via
+ `jang_tools.dsv4` before building the MLX skeleton. The loader applies
+ the DSV4-specific MLA absorb + fp32 SDPA + mHC + Compressor + Indexer
+ patches automatically.
+
+ ## Runtime support matrix
+
+ | Surface | Status |
+ |---|---|
+ | `jang-tools` Python (`load_jangtq_model`) | ✅ working |
+ | `vmlx-swift-lm` Swift | ✅ working (`DeepseekV4JANGTQ` family path) |
+ | MTP speculative decode | preserved-disabled: weights present (variant=std); accept/reject loop not yet in any JANG runtime |
+
+ ## Validated runtime contract
+
+ - 43 transformer layers + 1 MTP head materialize; 40 sparse-MoE layers
+   hydrate routed experts via TurboQuantLinear (2-bit MXTQ).
+ - 33,792 MXTQ tensors / 522 affine / 706 passthrough.
+ - Capabilities: `family=deepseek_v4`, `reasoning_parser=deepseek_r1`,
+   `tool_parser=dsml`, `think_in_template=True`, `cache_type=mla`.
+
+ ## Reasoning + tools
+
+ - **Reasoning parser:** `deepseek_r1`
+ - **Tool parser:** `dsml` (DeepSeek Markup Language, distinct from
+   `deepseek_tool_parser`; see `~/jang/research/DSV4-EVAL-NUANCES.md`)
+ - **Reasoning template:** `<|thinking_begin|>...<|thinking_end|>` blocks
+   via `enable_thinking=True` (default off, i.e. pass-through chat mode).
+   Greedy `T=0` with `enable_thinking=True` collapses into repetition on
+   DSV4; use `T=0.6` for pass@1 like the original DeepSeek release.
+ - **Cache:** `mla` (Multi-head Latent Attention with kv_lora_rank=512)
+
+ ## Credits
+
+ - **Quantization + MLX runtime:** Jinho Jang (eric@osaurus.ai)
+ - **Source model:** DeepSeek AI
+ - **License:** MIT, inherited from upstream
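The README's "Why top-6 + 2-bit holds" argument is that mixing several expert outputs averages out per-expert quantization error. The sketch below illustrates only the statistical intuition, under strong assumptions that the README does not state: each pathway's error is modelled as independent zero-mean Gaussian noise, and the pathways are combined with equal weights. Real routing weights are unequal and real quantization errors may be correlated, so treat the numbers as an upper bound on the benefit, not as a measurement of this model.

```python
import math
import random


def mixed_noise_std(k: int, sigma: float = 1.0, trials: int = 20000, seed: int = 0) -> float:
    """Empirical std of the equal-weight mean of k independent N(0, sigma^2) errors."""
    rng = random.Random(seed)
    samples = [
        sum(rng.gauss(0.0, sigma) for _ in range(k)) / k
        for _ in range(trials)
    ]
    mean = sum(samples) / trials
    return math.sqrt(sum((s - mean) ** 2 for s in samples) / trials)


# Compare the simulated std with the 1/sqrt(k) prediction for a few top-k values.
for k in (1, 2, 6, 8):
    print(k, round(mixed_noise_std(k), 3), round(1 / math.sqrt(k), 3))
```

Under these assumptions the residual error shrinks roughly as 1/sqrt(k), which is why a top-6 router tolerates coarser per-expert codebooks than a top-1 router would.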
config.json ADDED
@@ -0,0 +1,145 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ {
2
+ "architectures": [
3
+ "DeepseekV4ForCausalLM"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "bos_token_id": 0,
8
+ "eos_token_id": [
9
+ 1,
10
+ 128803
11
+ ],
12
+ "expert_dtype": "fp4",
13
+ "hc_eps": 1e-06,
14
+ "hc_mult": 4,
15
+ "hc_sinkhorn_iters": 20,
16
+ "head_dim": 512,
17
+ "hidden_act": "silu",
18
+ "hidden_size": 4096,
19
+ "index_head_dim": 128,
20
+ "index_n_heads": 64,
21
+ "index_topk": 512,
22
+ "initializer_range": 0.02,
23
+ "max_position_embeddings": 1048576,
24
+ "model_type": "deepseek_v4",
25
+ "moe_intermediate_size": 2048,
26
+ "n_routed_experts": 256,
27
+ "n_shared_experts": 1,
28
+ "norm_topk_prob": true,
29
+ "num_attention_heads": 64,
30
+ "num_experts_per_tok": 6,
31
+ "num_hidden_layers": 43,
32
+ "num_hash_layers": 3,
33
+ "num_key_value_heads": 1,
34
+ "num_nextn_predict_layers": 1,
35
+ "o_groups": 8,
36
+ "o_lora_rank": 1024,
37
+ "q_lora_rank": 1024,
38
+ "qk_rope_head_dim": 64,
39
+ "rms_norm_eps": 1e-06,
40
+ "rope_theta": 10000,
41
+ "routed_scaling_factor": 1.5,
42
+ "scoring_func": "sqrtsoftplus",
43
+ "sliding_window": 128,
44
+ "swiglu_limit": 10.0,
45
+ "tie_word_embeddings": false,
46
+ "topk_method": "noaux_tc",
47
+ "torch_dtype": "bfloat16",
48
+ "transformers_version": "4.57.1",
49
+ "use_cache": true,
50
+ "vocab_size": 129280,
51
+ "compress_rope_theta": 160000,
52
+ "compress_ratios": [
53
+ 0,
54
+ 0,
55
+ 4,
56
+ 128,
57
+ 4,
58
+ 128,
59
+ 4,
60
+ 128,
61
+ 4,
62
+ 128,
63
+ 4,
64
+ 128,
65
+ 4,
66
+ 128,
67
+ 4,
68
+ 128,
69
+ 4,
70
+ 128,
71
+ 4,
72
+ 128,
73
+ 4,
74
+ 128,
75
+ 4,
76
+ 128,
77
+ 4,
78
+ 128,
79
+ 4,
80
+ 128,
81
+ 4,
82
+ 128,
83
+ 4,
84
+ 128,
85
+ 4,
86
+ 128,
87
+ 4,
88
+ 128,
89
+ 4,
90
+ 128,
91
+ 4,
92
+ 128,
93
+ 4,
94
+ 128,
95
+ 4,
96
+ 0
97
+ ],
98
+ "rope_parameters": {
99
+ "beta_fast": 32.0,
100
+ "beta_slow": 1.0,
101
+ "factor": 16.0,
102
+ "original_max_position_embeddings": 65536,
103
+ "rope_type": "yarn",
104
+ "rope_theta": 10000.0
105
+ },
106
+ "quantization": {
107
+ "bits": 8,
108
+ "group_size": 32,
109
+ "mode": "affine",
110
+ "routed_expert_bits": 2,
111
+ "routed_expert_bit_plan": {
112
+ "default_bits": 2,
113
+ "codec": "mxtq"
114
+ },
115
+ "mxtq_bits": {
116
+ "routed_expert": 2,
117
+ "attention": 8,
118
+ "shared_expert": 8,
119
+ "compressor": 8,
120
+ "indexer": 8,
121
+ "embed_tokens": 8,
122
+ "lm_head": 8,
123
+ "norms_router_hc": 16
124
+ }
125
+ },
126
+ "weight_format": "mxtq",
127
+ "routed_expert_bits": 2,
128
+ "routed_expert_bit_plan": {
129
+ "default_bits": 2,
130
+ "codec": "mxtq"
131
+ },
132
+ "mxtq_bits": {
133
+ "routed_expert": 2,
134
+ "attention": 8,
135
+ "shared_expert": 8,
136
+ "compressor": 8,
137
+ "indexer": 8,
138
+ "embed_tokens": 8,
139
+ "lm_head": 8,
140
+ "norms_router_hc": 16
141
+ },
142
+ "mxtq_seed": 42,
143
+ "group_size": 32,
144
+ "_name_or_path": "DSV4-Flash-JANGTQ2"
145
+ }
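Several values in this config can be cross-checked against each other. The sketch below copies a subset of the fields into a dict and derives two relations: the YaRN `factor` times `original_max_position_embeddings` against `max_position_embeddings`, and the length of `compress_ratios` against the layer counts. The function name `derive_checks` is hypothetical, and reading `compress_ratios` as one entry per transformer layer plus the MTP layer is an assumption inferred from the matching length, not something the config states.

```python
from collections import Counter

# Subset of config.json from this diff; the list is rebuilt from its visible pattern.
CONFIG_SUBSET = {
    "num_hidden_layers": 43,
    "num_nextn_predict_layers": 1,
    "max_position_embeddings": 1048576,
    "rope_parameters": {"factor": 16.0, "original_max_position_embeddings": 65536},
    "compress_ratios": [0, 0] + [4, 128] * 20 + [4, 0],
}


def derive_checks(cfg: dict) -> dict:
    """Derive a few consistency facts from the config subset."""
    rope = cfg["rope_parameters"]
    yarn_context = int(rope["factor"] * rope["original_max_position_embeddings"])
    # Assumption: the MTP layer is counted alongside the transformer layers.
    total_layers = cfg["num_hidden_layers"] + cfg["num_nextn_predict_layers"]
    return {
        "yarn_context": yarn_context,
        "yarn_matches_max_positions": yarn_context == cfg["max_position_embeddings"],
        "total_layers": total_layers,
        "ratios_match_layer_count": len(cfg["compress_ratios"]) == total_layers,
        "ratio_histogram": dict(Counter(cfg["compress_ratios"])),
    }


checks = derive_checks(CONFIG_SUBSET)
print(checks)
```

A check like this is cheap to run before loading 75 shards, and it catches a hand-edited config whose list lengths no longer agree with the layer counts.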
encoding/README.md ADDED
@@ -0,0 +1,156 @@
+ # DeepSeek-V4 Encoding
+
+ This document describes the prompt encoding format used by DeepSeek-V4 series models. The encoding handles multi-turn conversations, tool calling, extended thinking (reasoning), and quick instruction tasks.
+
+ A self-contained reference implementation is provided in `encoding_dsv4.py`.
+
+ ## Quick Start
+
+ ```python
+ from encoding_dsv4 import encode_messages, parse_message_from_completion_text
+
+ # Encode a conversation
+ messages = [
+     {"role": "system", "content": "You are a helpful assistant."},
+     {"role": "user", "content": "What is 2+2?"},
+ ]
+ prompt = encode_messages(messages, thinking_mode="thinking")
+ # => "<|begin▁of▁sentence|>You are a helpful assistant.<|User|>What is 2+2?<|Assistant|><think>"
+
+ # Parse model output back to structured message
+ completion = "Simple arithmetic.</think>2 + 2 = 4.<|end▁of▁sentence|>"
+ parsed = parse_message_from_completion_text(completion, thinking_mode="thinking")
+ # => {"role": "assistant", "reasoning_content": "Simple arithmetic.", "content": "2 + 2 = 4.", "tool_calls": []}
+ ```
+
+ > **Note:** The `parse_message_from_completion_text` function is designed to handle well-formatted model output only. It does not attempt to correct or recover from malformed output that the model might occasionally generate. For production use, additional error handling is recommended.
+
+ ## Message Format
+
+ ### Special Tokens
+
+ | Token | Purpose |
+ |-------|---------|
+ | `<|begin▁of▁sentence|>` | Beginning of sequence (BOS) |
+ | `<|end▁of▁sentence|>` | End of assistant turn (EOS) |
+ | `<|User|>` | User turn prefix |
+ | `<|Assistant|>` | Assistant turn prefix |
+ | `<|latest_reminder|>` | Latest reminder (date, locale, etc.) |
+ | `<think>` / `</think>` | Reasoning block delimiters |
+ | `|DSML|` | DSML markup token |
+
+ ### Roles
+
+ The encoding supports the following message roles: `system`, `user`, `assistant`, `tool`, `latest_reminder`, and `developer`.
+
+ > **Note on the `developer` role:** The `developer` role is used exclusively in the internal search agent pipeline. It is not needed for general-purpose chat or tool-calling tasks, and the official API does not accept messages with this role.
+
+ ### Basic Chat
+
+ A simple multi-turn conversation is encoded as:
+
+ ```
+ <|begin▁of▁sentence|>{system_prompt}
+ <|User|>{user_message}<|Assistant|></think>{response}<|end▁of▁sentence|>
+ <|User|>{user_message_2}<|Assistant|></think>{response_2}<|end▁of▁sentence|>
+ ```
+
+ - The BOS token is prepended at the very beginning of the conversation.
+ - In **chat mode** (`thinking_mode="chat"`), `</think>` is placed right after `<|Assistant|>` to immediately close the thinking block, so the model generates content directly.
+
+ ### Interleaved Thinking Mode
+
+ In **thinking mode** (`thinking_mode="thinking"`), the model produces explicit reasoning inside `<think>...</think>` blocks before responding.
+
+ ```
+ <|begin▁of▁sentence|>{system_prompt}
+ <|User|>{message}<|Assistant|><think>{reasoning}</think>{response}<|end▁of▁sentence|>
+ ```
+
+ The `drop_thinking` parameter (default `True`) controls whether reasoning from earlier turns is preserved:
+
+ - **Without tools**: `drop_thinking` takes effect. Reasoning content from assistant turns **before** the last user message is stripped. Only the final assistant turn retains its `<think>...</think>` block.
+ - **With tools** (on system or developer message): `drop_thinking` is automatically disabled. All turns retain their reasoning, because tool-calling conversations require full context for the model to track multi-step reasoning across tool calls.
+
+ ### Tool Calling (DSML Format)
+
+ Tools are defined on the `system` or `developer` message via the `tools` field (OpenAI-compatible format). When tools are present, the following schema block is injected into the system/user prompt:
+
+ ```
+ ## Tools
+
+ You have access to a set of tools to help answer the user's question. You can invoke tools by writing a "<|DSML|tool_calls>" block like the following:
+
+ <|DSML|tool_calls>
+ <|DSML|invoke name="$TOOL_NAME">
+ <|DSML|parameter name="$PARAMETER_NAME" string="true|false">$PARAMETER_VALUE</|DSML|parameter>
+ ...
+ </|DSML|invoke>
+ <|DSML|invoke name="$TOOL_NAME2">
+ ...
+ </|DSML|invoke>
+ </|DSML|tool_calls>
+
+ String parameters should be specified as is and set `string="true"`. For all other types (numbers, booleans, arrays, objects), pass the value in JSON format and set `string="false"`.
+
+ If thinking_mode is enabled (triggered by <think>), you MUST output your complete reasoning inside <think>...</think> BEFORE any tool calls or final response.
+
+ Otherwise, output directly after </think> with tool calls or final response.
+
+ ### Available Tool Schemas
+
+ {tool_definitions_json}
+
+ You MUST strictly follow the above defined tool name and parameter schemas to invoke tool calls.
+ ```
+
+ An actual tool call in the assistant turn looks like:
+
+ ```xml
+ <|DSML|tool_calls>
+ <|DSML|invoke name="function_name">
+ <|DSML|parameter name="param" string="true">string_value</|DSML|parameter>
+ <|DSML|parameter name="count" string="false">5</|DSML|parameter>
+ </|DSML|invoke>
+ </|DSML|tool_calls><|end▁of▁sentence|>
+ ```
+
+ - `string="true"`: the parameter value is a raw string.
+ - `string="false"`: the parameter value is JSON (number, boolean, array, object).
+
+ Tool execution results are wrapped in `<tool_result>` tags within user messages:
+
+ ```
+ <|User|><tool_result>{result_json}</tool_result><|Assistant|><think>...
+ ```
+
+ When multiple tool results are present, they are sorted by the order of the corresponding `tool_calls` in the preceding assistant message.
+
+ ### Reasoning Effort
+
+ When `reasoning_effort="max"` is set, a special prefix is prepended at the very beginning of the prompt (before the system message) to instruct the model to maximize its reasoning depth:
+
+ ```
+ Reasoning Effort: Absolute maximum with no shortcuts permitted.
+ You MUST be very thorough in your thinking and comprehensively decompose the problem to resolve the root cause, rigorously stress-testing your logic against all potential paths, edge cases, and adversarial scenarios.
+ Explicitly write out your entire deliberation process, documenting every intermediate step, considered alternative, and rejected hypothesis to ensure absolutely no assumption is left unchecked.
+ ```
+
+ ### Quick Instruction Special Tokens
+
+ Quick instruction tokens are used for auxiliary classification and generation tasks. They are appended to messages via the `"task"` field to trigger specialized model behavior for a single-token or short-form output.
+
+ | Special Token | Description | Format |
+ |:---|:---|:---|
+ | `<|action|>` | Determines whether the user prompt requires a web search or can be answered directly. | `...<|User|>{prompt}<|Assistant|><think><|action|>` |
+ | `<|title|>` | Generates a concise conversation title after the first assistant response. | `...<|Assistant|>{response}<|end▁of▁sentence|><|title|>` |
+ | `<|query|>` | Generates search queries for the user prompt. | `...<|User|>{prompt}<|query|>` |
+ | `<|authority|>` | Classifies the user prompt's demand for source authoritativeness. | `...<|User|>{prompt}<|authority|>` |
+ | `<|domain|>` | Identifies the domain of the user prompt. | `...<|User|>{prompt}<|domain|>` |
+ | `<|extracted_url|>` `<|read_url|>` | Determines whether each URL in the user prompt should be fetched and read. | `...<|User|>{prompt}<|extracted_url|>{url}<|read_url|>` |
+
+ Usage in message format:
+
+ - **`action`** on a user message: the `<|action|>` token is placed after the assistant prefix and thinking token, triggering a routing decision (e.g., "Search" or "Answer").
+ - **Other tasks** (`query`, `authority`, `domain`, `read_url`) on a user message: the task token is appended directly after the user content.
+ - **`title`** on an assistant message: the `<|title|>` token is appended after the assistant's EOS. The next assistant message provides the generated title.
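The DSML tool-call format documented in encoding/README.md can be read back with two regular expressions. The sketch below is an illustration only and is not the reference `parse_message_from_completion_text`: the name `parse_dsml_tool_calls` is hypothetical, it returns arguments as a Python dict (the reference implementation's `decode_dsml_to_arguments` returns a JSON string), and like the reference parser it assumes well-formed output.

```python
import json
import re

# Matches one <|DSML|invoke name="..."> ... </|DSML|invoke> block.
INVOKE_RE = re.compile(
    r'<\|DSML\|invoke name="(?P<name>[^"]+)">(?P<body>.*?)</\|DSML\|invoke>',
    re.DOTALL,
)
# Matches one parameter element inside an invoke block.
PARAM_RE = re.compile(
    r'<\|DSML\|parameter name="(?P<key>[^"]+)" string="(?P<is_str>true|false)">'
    r'(?P<value>.*?)</\|DSML\|parameter>',
    re.DOTALL,
)


def parse_dsml_tool_calls(text: str) -> list:
    """Extract tool calls from well-formed DSML text as name/arguments dicts."""
    calls = []
    for invoke in INVOKE_RE.finditer(text):
        arguments = {}
        for param in PARAM_RE.finditer(invoke.group("body")):
            raw = param.group("value")
            # string="true" means a raw string; string="false" means a JSON value.
            is_string = param.group("is_str") == "true"
            arguments[param.group("key")] = raw if is_string else json.loads(raw)
        calls.append({"name": invoke.group("name"), "arguments": arguments})
    return calls


# The tool-call example from the README section on DSML.
EXAMPLE = """<|DSML|tool_calls>
<|DSML|invoke name="function_name">
<|DSML|parameter name="param" string="true">string_value</|DSML|parameter>
<|DSML|parameter name="count" string="false">5</|DSML|parameter>
</|DSML|invoke>
</|DSML|tool_calls><|end▁of▁sentence|>"""

calls = parse_dsml_tool_calls(EXAMPLE)
print(calls)
```

Keeping the `string` flag is what lets the parser distinguish the string `"5"` from the number `5`; a production parser would also need to handle truncated or malformed blocks, which this sketch does not attempt.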
encoding/__pycache__/encoding_dsv4.cpython-312.pyc ADDED
Binary file (28.6 kB).
encoding/__pycache__/encoding_dsv4.cpython-313.pyc ADDED
Binary file (28.6 kB).
encoding/encoding_dsv4.py ADDED
@@ -0,0 +1,744 @@
+ """
+ DeepSeek-V4 Encoding
+
+ A self-contained implementation for encoding/decoding DeepSeek-V4 chat messages
+ with tool calling, thinking mode, and quick instruction task support.
+ """
+
+ from typing import Any, Dict, List, Union, Optional, Tuple
+ import copy
+ import json
+ import re
+
+ # ============================================================
+ # Special Tokens
+ # ============================================================
+
+ bos_token: str = "<|begin▁of▁sentence|>"
+ eos_token: str = "<|end▁of▁sentence|>"
+ thinking_start_token: str = "<think>"
+ thinking_end_token: str = "</think>"
+ dsml_token: str = "|DSML|"
+
+ USER_SP_TOKEN = "<|User|>"
+ ASSISTANT_SP_TOKEN = "<|Assistant|>"
+ LATEST_REMINDER_SP_TOKEN = "<|latest_reminder|>"
+
+ # Task special tokens for internal classification tasks
+ DS_TASK_SP_TOKENS = {
+     "action": "<|action|>",
+     "query": "<|query|>",
+     "authority": "<|authority|>",
+     "domain": "<|domain|>",
+     "title": "<|title|>",
+     "read_url": "<|read_url|>",
+ }
+ VALID_TASKS = set(DS_TASK_SP_TOKENS.keys())
+
+ # ============================================================
+ # Templates
+ # ============================================================
+
+ system_msg_template: str = "{content}"
+ user_msg_template: str = "{content}"
+ latest_reminder_msg_template: str = "{content}"
+ assistant_msg_template: str = "{reasoning}{content}{tool_calls}" + eos_token
+ assistant_msg_wo_eos_template: str = "{reasoning}{content}{tool_calls}"
+ thinking_template: str = "{reasoning_content}"
+
+ response_format_template: str = (
+     "## Response Format:\n\nYou MUST strictly adhere to the following schema to reply:\n{schema}"
+ )
+ tool_call_template: str = (
+     "<{dsml_token}invoke name=\"{name}\">\n{arguments}\n</{dsml_token}invoke>"
+ )
+ tool_calls_template = (
+     "<{dsml_token}{tc_block_name}>\n{tool_calls}\n</{dsml_token}{tc_block_name}>"
+ )
+ tool_calls_block_name: str = "tool_calls"
+
+ tool_output_template: str = (
+     "<tool_result>{content}</tool_result>"
+ )
+
+ REASONING_EFFORT_MAX = (
+     "Reasoning Effort: Absolute maximum with no shortcuts permitted.\n"
+     "You MUST be very thorough in your thinking and comprehensively decompose the problem to resolve the root cause, rigorously stress-testing your logic against all potential paths, edge cases, and adversarial scenarios.\n"
+     "Explicitly write out your entire deliberation process, documenting every intermediate step, considered alternative, and rejected hypothesis to ensure absolutely no assumption is left unchecked.\n\n"
+ )
+
+ TOOLS_TEMPLATE = """## Tools
+
+ You have access to a set of tools to help answer the user's question. You can invoke tools by writing a "<{dsml_token}tool_calls>" block like the following:
+
+ <{dsml_token}tool_calls>
+ <{dsml_token}invoke name="$TOOL_NAME">
+ <{dsml_token}parameter name="$PARAMETER_NAME" string="true|false">$PARAMETER_VALUE</{dsml_token}parameter>
+ ...
+ </{dsml_token}invoke>
+ <{dsml_token}invoke name="$TOOL_NAME2">
+ ...
+ </{dsml_token}invoke>
+ </{dsml_token}tool_calls>
+
+ String parameters should be specified as is and set `string="true"`. For all other types (numbers, booleans, arrays, objects), pass the value in JSON format and set `string="false"`.
+
+ If thinking_mode is enabled (triggered by {thinking_start_token}), you MUST output your complete reasoning inside {thinking_start_token}...{thinking_end_token} BEFORE any tool calls or final response.
+
+ Otherwise, output directly after {thinking_end_token} with tool calls or final response.
+
+ ### Available Tool Schemas
+
+ {tool_schemas}
+
+ You MUST strictly follow the above defined tool name and parameter schemas to invoke tool calls.
+ """
+
+ # ============================================================
+ # Utility Functions
+ # ============================================================
+
+ def to_json(value: Any) -> str:
+     """Serialize a value to JSON string."""
+     try:
+         return json.dumps(value, ensure_ascii=False)
+     except Exception:  # fall back to ASCII-escaped output
+         return json.dumps(value, ensure_ascii=True)
+
+
+ def tools_from_openai_format(tools):
+     """Extract function definitions from OpenAI-format tool list."""
+     return [tool["function"] for tool in tools]
+
+
+ def tool_calls_from_openai_format(tool_calls):
+     """Convert OpenAI-format tool calls to internal format."""
+     return [
+         {
+             "name": tool_call["function"]["name"],
+             "arguments": tool_call["function"]["arguments"],
+         }
+         for tool_call in tool_calls
+     ]
+
+
+ def tool_calls_to_openai_format(tool_calls):
+     """Convert internal tool calls to OpenAI format."""
+     return [
+         {
+             "type": "function",
+             "function": {
+                 "name": tool_call["name"],
+                 "arguments": tool_call["arguments"],
+             }
+         }
+         for tool_call in tool_calls
+     ]
+
+
+ def encode_arguments_to_dsml(tool_call: Dict[str, str]) -> str:
+     """
+     Encode tool call arguments into DSML parameter format.
+
+     Args:
+         tool_call: Dict with "name" and "arguments" (JSON string) keys.
+
+     Returns:
+         DSML-formatted parameter string.
+     """
+     p_dsml_template = '<{dsml_token}parameter name="{key}" string="{is_str}">{value}</{dsml_token}parameter>'
+     p_dsml_strs = []
+
+     try:
+         arguments = json.loads(tool_call["arguments"])
+     except Exception:  # not valid JSON: pass the raw string through as one argument
+         arguments = {"arguments": tool_call["arguments"]}
+
+     for k, v in arguments.items():
+         p_dsml_str = p_dsml_template.format(
+             dsml_token=dsml_token,
+             key=k,
+             is_str="true" if isinstance(v, str) else "false",
+             value=v if isinstance(v, str) else to_json(v),
+         )
+         p_dsml_strs.append(p_dsml_str)
+
+     return "\n".join(p_dsml_strs)
+
+
+ def decode_dsml_to_arguments(tool_name: str, tool_args: Dict[str, Tuple[str, str]]) -> Dict[str, str]:
+     """
+     Decode DSML parameters back to a tool call dict.
+
+     Args:
+         tool_name: Name of the tool.
+         tool_args: Dict mapping param_name -> (value, is_string_flag).
+
+     Returns:
+         Dict with "name" and "arguments" (JSON string) keys.
+     """
+     def _decode_value(key: str, value: str, string: str):
+         if string == "true":
+             value = to_json(value)
+         return f"{to_json(key)}: {value}"
+
+     tool_args_json = "{" + ", ".join([_decode_value(k, v, string=is_str) for k, (v, is_str) in tool_args.items()]) + "}"
+     return dict(name=tool_name, arguments=tool_args_json)
+
+
+ def render_tools(tools: List[Dict[str, Union[str, Dict[str, Any]]]]) -> str:
+     """
+     Render tool schemas into the system prompt format.
+
+     Args:
+         tools: List of tool schema dicts (each with name, description, parameters).
+
+     Returns:
+         Formatted tools section string.
+     """
+     tools_json = [to_json(t) for t in tools]
+
+     return TOOLS_TEMPLATE.format(
+         tool_schemas="\n".join(tools_json),
+         dsml_token=dsml_token,
+         thinking_start_token=thinking_start_token,
+         thinking_end_token=thinking_end_token,
+     )
+
+
+ def find_last_user_index(messages: List[Dict[str, Any]]) -> int:
+     """Find the index of the last user/developer message."""
+     last_user_index = -1
+     for idx in range(len(messages) - 1, -1, -1):
+         if messages[idx].get("role") in ["user", "developer"]:
+             last_user_index = idx
+             break
+     return last_user_index
+
+
+ # ============================================================
+ # Message Rendering
+ # ============================================================
+
+ def render_message(index: int, messages: List[Dict[str, Any]], thinking_mode: str, drop_thinking: bool = True, reasoning_effort: Optional[str] = None) -> str:
+     """
+     Render a single message at the given index into its encoded string form.
+
+     This is the core function that converts each message in the conversation
+     into the DeepSeek-V4 format.
+
+     Args:
+         index: Index of the message to render.
+         messages: Full list of messages in the conversation.
+         thinking_mode: Either "chat" or "thinking".
+         drop_thinking: Whether to drop reasoning content from earlier turns.
+         reasoning_effort: Optional reasoning effort level ("max", "high", or None).
+
+     Returns:
+         Encoded string for this message.
+     """
+     assert 0 <= index < len(messages)
+     assert thinking_mode in ["chat", "thinking"], f"Invalid thinking_mode `{thinking_mode}`"
+
+     prompt = ""
+     msg = messages[index]
+     last_user_idx = find_last_user_index(messages)
+
+     role = msg.get("role")
+     content = msg.get("content")
+     tools = msg.get("tools")
+     response_format = msg.get("response_format")
+     tool_calls = msg.get("tool_calls")
+     reasoning_content = msg.get("reasoning_content")
+     wo_eos = msg.get("wo_eos", False)
+
+     if tools:
+         tools = tools_from_openai_format(tools)
+     if tool_calls:
+         tool_calls = tool_calls_from_openai_format(tool_calls)
+
+     # Reasoning effort prefix (only at index 0 in thinking mode with max effort)
+     assert reasoning_effort in ['max', None, 'high'], f"Invalid reasoning effort: {reasoning_effort}"
+     if index == 0 and thinking_mode == "thinking" and reasoning_effort == 'max':
+         prompt += REASONING_EFFORT_MAX
+
+     if role == "system":
+         prompt += system_msg_template.format(content=content or "")
+         if tools:
+             prompt += "\n\n" + render_tools(tools)
+         if response_format:
+             prompt += "\n\n" + response_format_template.format(schema=to_json(response_format))
+
+     elif role == "developer":
+         assert content, f"Invalid message for role `{role}`: {msg}"
+
+         content_developer = USER_SP_TOKEN
+         content_developer += content
+
+         if tools:
+             content_developer += "\n\n" + render_tools(tools)
+         if response_format:
+             content_developer += "\n\n" + response_format_template.format(schema=to_json(response_format))
+
+         prompt += user_msg_template.format(content=content_developer)
+
+     elif role == "user":
+         prompt += USER_SP_TOKEN
+
+         # Handle content blocks (tool results mixed with text)
+         content_blocks = msg.get("content_blocks")
+         if content_blocks:
+             parts = []
+             for block in content_blocks:
+                 block_type = block.get("type")
+                 if block_type == "text":
+                     parts.append(block.get("text", ""))
+                 elif block_type == "tool_result":
297
+ tool_content = block.get("content", "")
298
+ if isinstance(tool_content, list):
299
+ text_parts = []
300
+ for b in tool_content:
301
+ if b.get("type") == "text":
302
+ text_parts.append(b.get("text", ""))
303
+ else:
304
+ text_parts.append(f"[Unsupported {b.get('type')}]")
305
+ tool_content = "\n\n".join(text_parts)
306
+ parts.append(tool_output_template.format(content=tool_content))
307
+ else:
308
+ parts.append(f"[Unsupported {block_type}]")
309
+ prompt += "\n\n".join(parts)
310
+ else:
311
+ prompt += content or ""
312
+
313
+ elif role == "latest_reminder":
314
+ prompt += LATEST_REMINDER_SP_TOKEN + latest_reminder_msg_template.format(content=content)
315
+
316
+ elif role == "tool":
317
+ raise NotImplementedError("deepseek_v4 merges tool messages into user; please preprocess with merge_tool_messages()")
318
+
319
+ elif role == "assistant":
320
+ thinking_part = ""
321
+ tc_content = ""
322
+
323
+ if tool_calls:
324
+ tc_list = [
325
+ tool_call_template.format(
326
+ dsml_token=dsml_token,
327
+ name=tc.get("name"),
328
+ arguments=encode_arguments_to_dsml(tc)
329
+ )
330
+ for tc in tool_calls
331
+ ]
332
+ tc_content += '\n\n' + tool_calls_template.format(
333
+ dsml_token=dsml_token,
334
+ tool_calls="\n".join(tc_list),
335
+ tc_block_name=tool_calls_block_name,
336
+ )
337
+
338
+ summary_content = content or ""
339
+ rc = reasoning_content or ""
340
+
341
+ # Check if previous message has a task - if so, this is a task output (no thinking)
342
+ prev_has_task = index - 1 >= 0 and messages[index - 1].get("task") is not None
343
+
344
+ if thinking_mode == "thinking" and not prev_has_task:
345
+ if not drop_thinking or index > last_user_idx:
346
+ thinking_part = thinking_template.format(reasoning_content=rc) + thinking_end_token
347
+ else:
348
+ thinking_part = ""
349
+
350
+ if wo_eos:
351
+ prompt += assistant_msg_wo_eos_template.format(
352
+ reasoning=thinking_part,
353
+ content=summary_content,
354
+ tool_calls=tc_content,
355
+ )
356
+ else:
357
+ prompt += assistant_msg_template.format(
358
+ reasoning=thinking_part,
359
+ content=summary_content,
360
+ tool_calls=tc_content,
361
+ )
362
+ else:
363
+ raise NotImplementedError(f"Unknown role: {role}")
364
+
365
+ # Append transition tokens based on what follows
366
+ if index + 1 < len(messages) and messages[index + 1].get("role") not in ["assistant", "latest_reminder"]:
367
+ return prompt
368
+
369
+ task = messages[index].get("task")
370
+ if task is not None:
371
+ # Task special token for internal classification tasks
372
+ assert task in VALID_TASKS, f"Invalid task: '{task}'. Valid tasks are: {list(VALID_TASKS)}"
373
+ task_sp_token = DS_TASK_SP_TOKENS[task]
374
+
375
+ if task != "action":
376
+ # Non-action tasks: append task sp token directly after the message
377
+ prompt += task_sp_token
378
+ else:
379
+ # Action task: append Assistant + thinking token + action sp token
380
+ prompt += ASSISTANT_SP_TOKEN
381
+ prompt += thinking_end_token if thinking_mode != "thinking" else thinking_start_token
382
+ prompt += task_sp_token
383
+
384
+ elif messages[index].get("role") in ["user", "developer"]:
385
+ # Normal generation: append Assistant + thinking token
386
+ prompt += ASSISTANT_SP_TOKEN
387
+ if not drop_thinking and thinking_mode == "thinking":
388
+ prompt += thinking_start_token
389
+ elif drop_thinking and thinking_mode == "thinking" and index >= last_user_idx:
390
+ prompt += thinking_start_token
391
+ else:
392
+ prompt += thinking_end_token
393
+
394
+ return prompt
395
+
396
+
397
+ # ============================================================
398
+ # Preprocessing
399
+ # ============================================================
400
+
401
+ def merge_tool_messages(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
402
+ """
403
+ Merge tool messages into the preceding user message using content_blocks format.
404
+
405
+ DeepSeek-V4 does not have a standalone "tool" role; instead, tool results
406
+ are encoded as <tool_result> blocks within user messages.
407
+
408
+ This function converts a standard OpenAI-format conversation (with separate
409
+ "tool" role messages) into V4 format where tool results are merged into
410
+ user messages.
411
+
412
+ Args:
413
+ messages: List of message dicts in OpenAI format.
414
+
415
+ Returns:
416
+ Processed message list with tool messages merged into user messages.
417
+ """
418
+ merged: List[Dict[str, Any]] = []
419
+
420
+ for msg in messages:
421
+ msg = copy.deepcopy(msg)
422
+ role = msg.get("role")
423
+
424
+ if role == "tool":
425
+ # Convert tool message to a user message with tool_result block
426
+ tool_block = {
427
+ "type": "tool_result",
428
+ "tool_use_id": msg.get("tool_call_id", ""),
429
+ "content": msg.get("content", ""),
430
+ }
431
+ # Merge into previous message if it's already a user (merged tool)
432
+ if merged and merged[-1].get("role") == "user" and "content_blocks" in merged[-1]:
433
+ merged[-1]["content_blocks"].append(tool_block)
434
+ else:
435
+ merged.append({
436
+ "role": "user",
437
+ "content_blocks": [tool_block],
438
+ })
439
+ elif role == "user":
440
+ text_block = {"type": "text", "text": msg.get("content", "")}
441
+ if merged and merged[-1].get("role") == "user" and "content_blocks" in merged[-1] and merged[-1].get("task") is None:
442
+ merged[-1]["content_blocks"].append(text_block)
443
+ else:
444
+ new_msg = {
445
+ "role": "user",
446
+ "content": msg.get("content", ""),
447
+ "content_blocks": [text_block],
448
+ }
449
+ # Preserve extra fields (task, wo_eos, mask, etc.)
450
+ for key in ("task", "wo_eos", "mask"):
451
+ if key in msg:
452
+ new_msg[key] = msg[key]
453
+ merged.append(new_msg)
454
+ else:
455
+ merged.append(msg)
456
+
457
+ return merged
458
+
459
+
460
+ def sort_tool_results_by_call_order(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
461
+ """
462
+ Sort tool_result blocks within user messages by the order of tool_calls
463
+ in the preceding assistant message.
464
+
465
+ Args:
466
+ messages: Preprocessed message list (after merge_tool_messages).
467
+
468
+ Returns:
469
+ Message list with sorted tool result blocks.
470
+ """
471
+ last_tool_call_order: Dict[str, int] = {}
472
+
473
+ for msg in messages:
474
+ role = msg.get("role")
475
+ if role == "assistant" and msg.get("tool_calls"):
476
+ last_tool_call_order = {}
477
+ for idx, tc in enumerate(msg["tool_calls"]):
478
+ tc_id = tc.get("id") or tc.get("function", {}).get("id", "")
479
+ if tc_id:
480
+ last_tool_call_order[tc_id] = idx
481
+
482
+ elif role == "user" and msg.get("content_blocks"):
483
+ tool_blocks = [b for b in msg["content_blocks"] if b.get("type") == "tool_result"]
484
+ if len(tool_blocks) > 1 and last_tool_call_order:
485
+ sorted_blocks = sorted(
486
+ tool_blocks,
487
+ key=lambda b: last_tool_call_order.get(b.get("tool_use_id", ""), 0)
488
+ )
489
+ sorted_idx = 0
490
+ new_blocks = []
491
+ for block in msg["content_blocks"]:
492
+ if block.get("type") == "tool_result":
493
+ new_blocks.append(sorted_blocks[sorted_idx])
494
+ sorted_idx += 1
495
+ else:
496
+ new_blocks.append(block)
497
+ msg["content_blocks"] = new_blocks
498
+
499
+ return messages
500
+
501
+
502
+ # ============================================================
503
+ # Main Encoding Function
504
+ # ============================================================
505
+
506
+ def encode_messages(
507
+ messages: List[Dict[str, Any]],
508
+ thinking_mode: str,
509
+ context: Optional[List[Dict[str, Any]]] = None,
510
+ drop_thinking: bool = True,
511
+ add_default_bos_token: bool = True,
512
+ reasoning_effort: Optional[str] = None,
513
+ ) -> str:
514
+ """
515
+ Encode a list of messages into the DeepSeek-V4 prompt format.
516
+
517
+ This is the main entry point for encoding conversations. It handles:
518
+ - BOS token insertion
519
+ - Thinking mode with optional reasoning content dropping
520
+ - Tool message merging into user messages
521
+ - Multi-turn conversation context
522
+
523
+ Args:
524
+ messages: List of message dicts to encode.
525
+ thinking_mode: Either "chat" or "thinking".
526
+ context: Optional preceding context messages (already encoded prefix).
527
+ drop_thinking: If True, drop reasoning_content from earlier assistant turns
528
+ (only keep reasoning for messages after the last user message). Ignored (treated as False) when any message defines tools.
529
+ add_default_bos_token: Whether to prepend BOS token at conversation start.
530
+ reasoning_effort: Optional reasoning effort level ("max", "high", or None).
531
+
532
+ Returns:
533
+ The encoded prompt string.
534
+ """
535
+ context = context if context else []
536
+
537
+ # Preprocess: merge tool messages and sort tool results
538
+ messages = merge_tool_messages(messages)
539
+ messages = sort_tool_results_by_call_order(context + messages)[len(context):]
540
+ if context:
541
+ context = merge_tool_messages(context)
542
+ context = sort_tool_results_by_call_order(context)
543
+
544
+ full_messages = context + messages
545
+
546
+ prompt = bos_token if add_default_bos_token and len(context) == 0 else ""
547
+
548
+ # Resolve drop_thinking: if any message has tools defined, don't drop thinking
549
+ effective_drop_thinking = drop_thinking
550
+ if any(m.get("tools") for m in full_messages):
551
+ effective_drop_thinking = False
552
+
553
+ if thinking_mode == "thinking" and effective_drop_thinking:
554
+ full_messages = _drop_thinking_messages(full_messages)
555
+ # After dropping, recalculate how many messages to render
556
+ # (context may have shrunk too)
557
+ num_to_render = len(full_messages) - len(_drop_thinking_messages(context))
558
+ context_len = len(full_messages) - num_to_render
559
+ else:
560
+ num_to_render = len(messages)
561
+ context_len = len(context)
562
+
563
+ for idx in range(num_to_render):
564
+ prompt += render_message(
565
+ idx + context_len,
566
+ full_messages,
567
+ thinking_mode=thinking_mode,
568
+ drop_thinking=effective_drop_thinking,
569
+ reasoning_effort=reasoning_effort,
570
+ )
571
+
572
+ return prompt
573
+
574
+
575
+ def _drop_thinking_messages(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
576
+ """
577
+ Drop reasoning_content and non-essential messages before the last user message.
578
+
579
+ Behavior:
580
+ - Messages with role in ["user", "system", "tool", "latest_reminder", "direct_search_results"] are always kept.
581
+ - Messages at or after the last user index are always kept.
582
+ - Assistant messages before the last user get reasoning_content removed.
583
+ - Developer messages (and any other roles not listed above) before the last user are dropped entirely.
584
+ """
585
+ last_user_idx = find_last_user_index(messages)
586
+ result = []
587
+ keep_roles = {"user", "system", "tool", "latest_reminder", "direct_search_results"}
588
+
589
+ for idx, msg in enumerate(messages):
590
+ role = msg.get("role")
591
+ if role in keep_roles or idx >= last_user_idx:
592
+ result.append(msg)
593
+ elif role == "assistant":
594
+ msg = copy.copy(msg)
595
+ msg.pop("reasoning_content", None)
596
+ result.append(msg)
597
+ # developer and other roles before last_user_idx are dropped
598
+
599
+ return result
600
+
601
+
602
+ # ============================================================
603
+ # Parsing (Decoding model output)
604
+ # ============================================================
605
+
606
+ def _read_until_stop(index: int, text: str, stop: List[str]) -> Tuple[int, str, Optional[str]]:
607
+ """
608
+ Read text from index until one of the stop strings is found.
609
+
610
+ Returns:
611
+ Tuple of (new_index, content_before_stop, matched_stop_string_or_None).
612
+ """
613
+ min_pos = len(text)
614
+ matched_stop = None
615
+
616
+ for s in stop:
617
+ pos = text.find(s, index)
618
+ if pos != -1 and pos < min_pos:
619
+ min_pos = pos
620
+ matched_stop = s
621
+
622
+ if matched_stop:
623
+ content = text[index:min_pos]
624
+ return min_pos + len(matched_stop), content, matched_stop
625
+ else:
626
+ content = text[index:]
627
+ return len(text), content, None
628
+
629
+
630
+ def parse_tool_calls(index: int, text: str) -> Tuple[int, Optional[str], List[Dict[str, str]]]:
631
+ """
632
+ Parse DSML tool calls from text starting at the given index.
633
+
634
+ Args:
635
+ index: Starting position in text.
636
+ text: The full text to parse.
637
+
638
+ Returns:
639
+ Tuple of (new_index, last_stop_token, list_of_tool_call_dicts).
640
+ Each tool call dict has "name" and "arguments" keys.
641
+ """
642
+ tool_calls: List[Dict[str, Any]] = []
643
+ stop_token = None
644
+ tool_calls_end_token = f"</{dsml_token}{tool_calls_block_name}>"
645
+
646
+ while index < len(text):
647
+ index, prefix, stop_token = _read_until_stop(index, text, [f"<{dsml_token}invoke", tool_calls_end_token])
648
+ if prefix != ">\n":
649
+ raise ValueError(f"Tool call format error: expected '>\\n' but got '{prefix}'")
650
+
651
+ if stop_token == tool_calls_end_token:
652
+ break
653
+
654
+ if stop_token is None:
655
+ raise ValueError("Missing special token in tool calls")
656
+
657
+ index, tool_name_content, stop_token = _read_until_stop(index, text, [f"<{dsml_token}parameter", f"</{dsml_token}invoke"])
658
+
659
+ p_tool_name = re.findall(r'^\s*name="(.*?)">\n$', tool_name_content, flags=re.DOTALL)
660
+ if len(p_tool_name) != 1:
661
+ raise ValueError(f"Tool name format error: '{tool_name_content}'")
662
+ tool_name = p_tool_name[0]
663
+
664
+ tool_args: Dict[str, Tuple[str, str]] = {}
665
+ while stop_token == f"<{dsml_token}parameter":
666
+ index, param_content, stop_token = _read_until_stop(index, text, [f"/{dsml_token}parameter"])
667
+
668
+ param_kv = re.findall(r'^ name="(.*?)" string="(true|false)">(.*?)<$', param_content, flags=re.DOTALL)
669
+ if len(param_kv) != 1:
670
+ raise ValueError(f"Parameter format error: '{param_content}'")
671
+ param_name, string, param_value = param_kv[0]
672
+
673
+ if param_name in tool_args:
674
+ raise ValueError(f"Duplicate parameter name: '{param_name}'")
675
+ tool_args[param_name] = (param_value, string)
676
+
677
+ index, content, stop_token = _read_until_stop(index, text, [f"<{dsml_token}parameter", f"</{dsml_token}invoke"])
678
+ if content != ">\n":
679
+ raise ValueError(f"Parameter format error: expected '>\\n' but got '{content}'")
680
+
681
+ tool_call = decode_dsml_to_arguments(tool_name=tool_name, tool_args=tool_args)
682
+ tool_calls.append(tool_call)
683
+
684
+ return index, stop_token, tool_calls
685
+
686
+
687
+ def parse_message_from_completion_text(text: str, thinking_mode: str) -> Dict[str, Any]:
688
+ """
689
+ Parse a model completion text into a structured assistant message.
690
+
691
+ This function takes the raw text output from the model (a single assistant turn)
692
+ and extracts:
693
+ - reasoning_content (thinking block)
694
+ - content (summary/response)
695
+ - tool_calls (if any)
696
+
697
+ NOTE: This function is designed to parse only correctly formatted strings and
698
+ will raise ValueError for malformed output.
699
+
700
+ Args:
701
+ text: The raw completion text (including EOS token).
702
+ thinking_mode: Either "chat" or "thinking".
703
+
704
+ Returns:
705
+ Dict with keys: "role", "content", "reasoning_content", "tool_calls".
706
+ tool_calls are in OpenAI format.
707
+ """
708
+ summary_content, reasoning_content, tool_calls = "", "", []
709
+ index, stop_token = 0, None
710
+ tool_calls_start_token = f"\n\n<{dsml_token}{tool_calls_block_name}"
711
+
712
+ is_thinking = thinking_mode == "thinking"
713
+ is_tool_calling = False
714
+
715
+ if is_thinking:
716
+ index, content_delta, stop_token = _read_until_stop(index, text, [thinking_end_token, tool_calls_start_token])
717
+ reasoning_content = content_delta
718
+ assert stop_token == thinking_end_token, "Invalid thinking format: missing </think>"
719
+
720
+ index, content_delta, stop_token = _read_until_stop(index, text, [eos_token, tool_calls_start_token])
721
+ summary_content = content_delta
722
+ if stop_token == tool_calls_start_token:
723
+ is_tool_calling = True
724
+ else:
725
+ assert stop_token == eos_token, "Invalid format: missing EOS token"
726
+
727
+ if is_tool_calling:
728
+ index, stop_token, tool_calls = parse_tool_calls(index, text)
729
+
730
+ index, tool_ends_text, stop_token = _read_until_stop(index, text, [eos_token])
731
+ assert not tool_ends_text, "Unexpected content after tool calls"
732
+
733
+ assert len(text) == index and stop_token in [eos_token, None], "Unexpected content at end"
734
+
735
+ for sp_token in [bos_token, eos_token, thinking_start_token, thinking_end_token, dsml_token]:
736
+ assert sp_token not in summary_content and sp_token not in reasoning_content, \
737
+ f"Unexpected special token '{sp_token}' in content"
738
+
739
+ return {
740
+ "role": "assistant",
741
+ "content": summary_content,
742
+ "reasoning_content": reasoning_content,
743
+ "tool_calls": tool_calls_to_openai_format(tool_calls)
744
+ }
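The parsing functions above all rest on one primitive: scan forward from an index and stop at whichever stop string occurs earliest. The sketch below restates that idea as a standalone function for illustration only. The name `read_until_stop` and the sample completion string are assumptions made for this example and are not part of the commit; the token spellings are copied from the test fixtures.

```python
from typing import List, Optional, Tuple


def read_until_stop(index: int, text: str, stop: List[str]) -> Tuple[int, str, Optional[str]]:
    """Illustrative restatement of an earliest-stop scan (not the committed code)."""
    min_pos = len(text)
    matched = None
    for s in stop:
        pos = text.find(s, index)
        # Keep the stop string that appears first in the remaining text.
        if pos != -1 and pos < min_pos:
            min_pos, matched = pos, s
    if matched is None:
        # No stop string found: consume the rest of the text.
        return len(text), text[index:], None
    # Return the index just past the matched stop string.
    return min_pos + len(matched), text[index:min_pos], matched


# Hypothetical completion text, shaped like a thinking-mode assistant turn.
completion = "Let me think.</think>Paris.<|end▁of▁sentence|>"
idx, reasoning, stop = read_until_stop(0, completion, ["</think>"])
idx, answer, stop2 = read_until_stop(idx, completion, ["<|end▁of▁sentence|>"])
print(repr(reasoning), repr(answer))
```

Returning the matched stop string alongside the content lets a caller branch on which delimiter ended the span, which is how the parser above distinguishes a plain answer from one followed by tool calls.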
encoding/test_encoding_dsv4.py ADDED
@@ -0,0 +1,89 @@
1
+ """
2
+ Test suite for DeepSeek-V4 Encoding.
3
+
4
+ Run: python test_encoding_dsv4.py
5
+ """
6
+
7
+ import json
8
+ import os
9
+
10
+ from encoding_dsv4 import encode_messages, parse_message_from_completion_text
11
+
12
+ TESTS_DIR = os.path.join(os.path.dirname(__file__), "tests")
13
+
14
+
15
+ def test_case_1():
16
+ """Thinking mode with tool calls (multi-turn, tool results merged into user)."""
17
+ with open(os.path.join(TESTS_DIR, "test_input_1.json")) as f:
18
+ td = json.load(f)
19
+ messages = td["messages"]
20
+ messages[0]["tools"] = td["tools"]
21
+ gold = open(os.path.join(TESTS_DIR, "test_output_1.txt")).read()
22
+ prompt = encode_messages(messages, thinking_mode="thinking")
23
+ assert prompt == gold
24
+
25
+ # Parse: assistant turn with tool call
26
+ marker = "<|Assistant|><think>"
27
+ first_start = prompt.find(marker) + len(marker)
28
+ first_end = prompt.find("<|User|>", first_start)
29
+ parsed_tc = parse_message_from_completion_text(prompt[first_start:first_end], thinking_mode="thinking")
30
+ assert parsed_tc["reasoning_content"] == "The user wants to know the weather in Beijing. I should use the get_weather tool."
31
+ assert parsed_tc["content"] == ""
32
+ assert len(parsed_tc["tool_calls"]) == 1
33
+ assert parsed_tc["tool_calls"][0]["function"]["name"] == "get_weather"
34
+ assert json.loads(parsed_tc["tool_calls"][0]["function"]["arguments"]) == {"location": "Beijing", "unit": "celsius"}
35
+
36
+ # Parse: final assistant turn with content
37
+ last_start = prompt.rfind(marker) + len(marker)
38
+ parsed_final = parse_message_from_completion_text(prompt[last_start:], thinking_mode="thinking")
39
+ assert parsed_final["reasoning_content"] == "Got the weather data. Let me format a nice response."
40
+ assert "22°C" in parsed_final["content"]
41
+ assert parsed_final["tool_calls"] == []
42
+
43
+ print(" [PASS] case 1: thinking with tools (encode + parse)")
44
+
45
+
46
+ def test_case_2():
47
+ """Thinking mode without tools (drop_thinking removes earlier reasoning)."""
48
+ messages = json.load(open(os.path.join(TESTS_DIR, "test_input_2.json")))
49
+ gold = open(os.path.join(TESTS_DIR, "test_output_2.txt")).read()
50
+ prompt = encode_messages(messages, thinking_mode="thinking")
51
+ assert prompt == gold
52
+
53
+ # Parse: last assistant turn
54
+ marker = "<|Assistant|><think>"
55
+ last_start = prompt.rfind(marker) + len(marker)
56
+ parsed = parse_message_from_completion_text(prompt[last_start:], thinking_mode="thinking")
57
+ assert parsed["reasoning_content"] == "The user asks about the capital of France. It is Paris."
58
+ assert parsed["content"] == "The capital of France is Paris."
59
+ assert parsed["tool_calls"] == []
60
+
61
+ # Verify drop_thinking: first assistant's reasoning should be absent
62
+ assert "The user said hello" not in prompt
63
+
64
+ print(" [PASS] case 2: thinking without tools (encode + parse)")
65
+
66
+
67
+ def test_case_3():
68
+ """Interleaved thinking + search (developer with tools, latest_reminder)."""
69
+ messages = json.load(open(os.path.join(TESTS_DIR, "test_input_3.json")))
70
+ gold = open(os.path.join(TESTS_DIR, "test_output_3.txt")).read()
71
+ assert encode_messages(messages, thinking_mode="thinking") == gold
72
+ print(" [PASS] case 3: interleaved thinking + search")
73
+
74
+
75
+ def test_case_4():
76
+ """Quick instruction task with latest_reminder (chat mode, action task)."""
77
+ messages = json.load(open(os.path.join(TESTS_DIR, "test_input_4.json")))
78
+ gold = open(os.path.join(TESTS_DIR, "test_output_4.txt")).read()
79
+ assert encode_messages(messages, thinking_mode="chat") == gold
80
+ print(" [PASS] case 4: quick instruction task")
81
+
82
+
83
+ if __name__ == "__main__":
84
+ print("Running DeepSeek-V4 Encoding Tests...\n")
85
+ test_case_1()
86
+ test_case_2()
87
+ test_case_3()
88
+ test_case_4()
89
+ print("\nAll 4 tests passed!")
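Test case 1 depends on OpenAI-style `tool` messages being folded into a user turn before rendering. The following is a minimal sketch of that folding step, assuming the `content_blocks` layout used in the module. The function name `merge_tool_results` is hypothetical, and the sketch deliberately omits the handling of plain user text and of extra fields such as `task`.

```python
import copy
from typing import Any, Dict, List


def merge_tool_results(messages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Simplified illustration: fold consecutive tool messages into one user turn."""
    merged: List[Dict[str, Any]] = []
    for msg in messages:
        msg = copy.deepcopy(msg)  # never mutate the caller's messages
        if msg.get("role") == "tool":
            block = {
                "type": "tool_result",
                "tool_use_id": msg.get("tool_call_id", ""),
                "content": msg.get("content", ""),
            }
            prev = merged[-1] if merged else None
            if prev is not None and prev.get("role") == "user" and "content_blocks" in prev:
                # A merged user turn already exists: append to it.
                prev["content_blocks"].append(block)
            else:
                merged.append({"role": "user", "content_blocks": [block]})
        else:
            merged.append(msg)
    return merged


# Hypothetical conversation fragment with two tool results in a row.
convo = [
    {"role": "assistant", "tool_calls": [{"id": "call_001"}, {"id": "call_002"}]},
    {"role": "tool", "tool_call_id": "call_001", "content": "A"},
    {"role": "tool", "tool_call_id": "call_002", "content": "B"},
]
result = merge_tool_results(convo)
print([m["role"] for m in result])
```

Two adjacent tool messages end up as two `tool_result` blocks inside a single user message, which matches the single `<|User|>` turn carrying the tool result in the expected output of test case 1.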
encoding/tests/test_input_1.json ADDED
@@ -0,0 +1,81 @@
1
+ {
2
+ "tools": [
3
+ {
4
+ "type": "function",
5
+ "function": {
6
+ "name": "get_weather",
7
+ "description": "Get the weather for a specific location",
8
+ "parameters": {
9
+ "type": "object",
10
+ "properties": {
11
+ "location": {
12
+ "type": "string",
13
+ "description": "The city name"
14
+ },
15
+ "unit": {
16
+ "type": "string",
17
+ "enum": ["celsius", "fahrenheit"],
18
+ "description": "Temperature unit"
19
+ }
20
+ },
21
+ "required": ["location"]
22
+ }
23
+ }
24
+ },
25
+ {
26
+ "type": "function",
27
+ "function": {
28
+ "name": "search",
29
+ "description": "Search the web for information",
30
+ "parameters": {
31
+ "type": "object",
32
+ "properties": {
33
+ "query": {
34
+ "type": "string",
35
+ "description": "Search query"
36
+ },
37
+ "num_results": {
38
+ "type": "integer",
39
+ "description": "Number of results to return"
40
+ }
41
+ },
42
+ "required": ["query"]
43
+ }
44
+ }
45
+ }
46
+ ],
47
+ "messages": [
48
+ {
49
+ "role": "system",
50
+ "content": "You are a helpful assistant."
51
+ },
52
+ {
53
+ "role": "user",
54
+ "content": "What's the weather in Beijing?"
55
+ },
56
+ {
57
+ "role": "assistant",
58
+ "reasoning_content": "The user wants to know the weather in Beijing. I should use the get_weather tool.",
59
+ "tool_calls": [
60
+ {
61
+ "id": "call_001",
62
+ "type": "function",
63
+ "function": {
64
+ "name": "get_weather",
65
+ "arguments": "{\"location\": \"Beijing\", \"unit\": \"celsius\"}"
66
+ }
67
+ }
68
+ ]
69
+ },
70
+ {
71
+ "role": "tool",
72
+ "tool_call_id": "call_001",
73
+ "content": "{\"temperature\": 22, \"condition\": \"sunny\", \"humidity\": 45}"
74
+ },
75
+ {
76
+ "role": "assistant",
77
+ "reasoning_content": "Got the weather data. Let me format a nice response.",
78
+ "content": "The weather in Beijing is currently sunny with a temperature of 22°C and 45% humidity."
79
+ }
80
+ ]
81
+ }
encoding/tests/test_input_2.json ADDED
@@ -0,0 +1,24 @@
1
+ [
2
+ {
3
+ "role": "system",
4
+ "content": "You are a helpful assistant."
5
+ },
6
+ {
7
+ "role": "user",
8
+ "content": "Hello"
9
+ },
10
+ {
11
+ "role": "assistant",
12
+ "reasoning_content": "The user said hello, I should greet back.",
13
+ "content": "Hi there! How can I help you?"
14
+ },
15
+ {
16
+ "role": "user",
17
+ "content": "What is the capital of France?"
18
+ },
19
+ {
20
+ "role": "assistant",
21
+ "reasoning_content": "The user asks about the capital of France. It is Paris.",
22
+ "content": "The capital of France is Paris."
23
+ }
24
+ ]
encoding/tests/test_input_3.json ADDED
@@ -0,0 +1,159 @@
1
+ [
2
+ {
3
+ "role": "system",
4
+ "content": "该助手为DeepSeek,由深度求索公司创造。"
5
+ },
6
+ {
7
+ "role": "latest_reminder",
8
+ "content": "2026-02-21,星期六,广州,App,中文"
9
+ },
10
+ {
11
+ "role": "developer",
12
+ "content": "小柴胡冲剂和布洛芬能一起吃吗?\n\nCITATION FORMAT: 【{cursor_id}†L{start_line_id}(-L{end_line_id})?】",
13
+ "tools": [
14
+ {
15
+ "type": "function",
16
+ "function": {
17
+ "name": "search",
18
+ "description": "Web search. Split multiple queries with '||'.",
19
+ "parameters": {
20
+ "type": "object",
21
+ "properties": {
22
+ "queries": {
23
+ "type": "string",
24
+ "description": "query1||query2"
25
+ }
26
+ },
27
+ "required": [
28
+ "queries"
29
+ ],
30
+ "additionalProperties": false,
31
+ "$schema": "http://json-schema.org/draft-07/schema#"
32
+ }
33
+ }
34
+ },
35
+ {
36
+ "type": "function",
37
+ "function": {
38
+ "name": "open",
39
+ "description": "Batch open IDs (format 【{id}†...】) or URLs.",
40
+ "parameters": {
41
+ "type": "object",
42
+ "properties": {
43
+ "open_list": {
44
+ "type": "array",
45
+ "items": {
46
+ "type": "object",
47
+ "properties": {
48
+ "id": {
49
+ "description": "ID or URL",
50
+ "anyOf": [
51
+ {
52
+ "type": "integer"
53
+ },
54
+ {
55
+ "type": "string"
56
+ }
57
+ ],
58
+ "default": -1
59
+ },
60
+ "cursor": {
61
+ "type": "integer",
62
+ "description": "",
63
+ "default": -1
64
+ },
65
+ "loc": {
66
+ "type": "integer",
67
+ "description": "Start line",
68
+ "default": -1
69
+ },
70
+ "num_lines": {
71
+ "type": "integer",
72
+ "description": "",
73
+ "default": -1
74
+ },
75
+ "view_source": {
76
+ "type": "boolean",
77
+ "description": "",
78
+ "default": false
79
+ }
80
+ },
81
+ "additionalProperties": false
82
+ },
83
+ "description": ""
84
+ }
85
+ },
86
+ "required": [
87
+ "open_list"
88
+ ],
89
+ "additionalProperties": false,
90
+ "$schema": "http://json-schema.org/draft-07/schema#"
91
+ }
92
+ }
93
+ },
94
+ {
95
+ "type": "function",
96
+ "function": {
97
+ "name": "find",
98
+ "description": "Find exact text pattern in pages.",
99
+ "parameters": {
100
+ "type": "object",
101
+ "properties": {
102
+ "find_list": {
103
+ "type": "array",
104
+ "items": {
105
+ "type": "object",
106
+ "properties": {
107
+ "pattern": {
108
+ "type": "string",
109
+ "description": ""
110
+ },
111
+ "cursor": {
112
+ "type": "integer",
113
+ "description": "",
114
+ "default": -1
115
+ }
116
+ },
117
+ "required": [
118
+ "pattern"
119
+ ],
120
+ "additionalProperties": false
121
+ },
122
+ "description": ""
123
+ }
124
+ },
125
+ "required": [
126
+ "find_list"
127
+ ],
128
+ "additionalProperties": false,
129
+ "$schema": "http://json-schema.org/draft-07/schema#"
130
+ }
131
+ }
132
+ }
133
+ ]
134
+ },
135
+ {
136
+ "role": "assistant",
137
+ "content": "",
138
+ "reasoning_content": "用户想知道小柴胡冲剂和布洛芬能否一起服用。",
139
+ "tool_calls": [
140
+ {
141
+ "type": "function",
142
+ "function": {
143
+ "name": "search",
144
+ "arguments": "{\"queries\": \"小柴胡冲剂 布洛芬 相互作用 一起吃\"}"
145
+ }
146
+ }
147
+ ]
148
+ },
149
+ {
150
+ "role": "tool",
151
+ "content": "[0]"
152
+ },
153
+ {
154
+ "role": "assistant",
155
+ "content": "请及时就医。",
156
+ "reasoning_content": "现在开始组织回答。",
157
+ "tool_calls": []
158
+ }
159
+ ]
encoding/tests/test_input_4.json ADDED
@@ -0,0 +1,28 @@
1
+ [
2
+ {
3
+ "role": "system",
4
+ "content": "该助手为DeepSeek-V3,由深度求索公司创造。\n今天是2025年10月17日,星期五。"
5
+ },
6
+ {
7
+ "role": "latest_reminder",
8
+ "content": "2024-11-15,上海市,App,中文"
9
+ },
10
+ {
11
+ "role": "user",
12
+ "content": "热海大滚锅是世界著名温泉吗"
13
+ },
14
+ {
15
+ "role": "assistant",
16
+ "content": "关于热海大滚锅是否是世界著名温泉,可以这样总结:它在**中国乃至全球的地热奇观中占有重要地位**,但“世界著名”的称号更侧重于它作为**独特的地质现象和旅游景点**,而非普遍的温泉疗养体验。\n\n为了让你快速了解,我整理了一个简要的表格:\n\n| 维度 | 热海大滚锅的地位与特点 |\n| :--- | :--- |\n| **地质奇观** | **世界罕见**的高温地热系统,泉眼多、水温高、形态多样。 |\n| **国内声誉** | **中国三大地热区之一**,**国家5A级旅游景区**,自古闻名(徐霞客曾游历并记载)。 |\n| **国际知名度** | 在一些旅游平台被提及为“世界六大温泉”之一,但此说法流传不广,其国际声誉更多建立在地质独特性上。 |\n| **核心体验** | **观赏地热奇观**(如97℃沸腾的“大滚锅”)、**体验温泉煮鸡蛋**。 |\n\n### 💡 游玩攻略与温馨提示\n\n如果你计划前往热海大滚锅,这里有一些实用信息供你参考:\n\n- **门票与开放时间**:\n - **门票**:景区门票约为**50元/人**。如果选择包含温泉沐浴的套餐,价格会更高,例如约**288元**。\n - **开放时间**:景区一般**08:00-18:00**开放,但具体时间可能变动,建议提前核实。\n\n- **特色体验**:\n - **温泉煮鸡蛋**:这几乎是必试项目。可以在景区门口购买用草绳串起的生鸡蛋(约5-8元/串),然后到“大滚锅”旁的指定区域蒸煮,几分钟便可熟食,趣味十足。\n - **金汤足浴**:可以直接用从“大滚锅”流出的温泉水泡脚,缓解旅途疲劳。\n\n- **注意事项**:\n - **安全第一**:“大滚锅”水温极高,务必遵守游览规则,在指定区域内观赏,切勿随意触碰泉水。\n - **规划行程**:建议为热海景区预留**3-4小时**的游览时间。景区内步道不走回头路,出入口有观光车接送。\n\n希望这些信息能帮助你更好地了解热海大滚锅。如果你对腾冲的其他景点或者行程规划有更多疑问,我很乐意提供进一步的信息。",
17
+ "mask": 1
18
+ },
19
+ {
20
+ "role": "user",
21
+ "content": "世界著名温泉有哪些",
22
+ "task": "action"
23
+ },
24
+ {
25
+ "role": "assistant",
26
+ "content": "Search"
27
+ }
28
+ ]
encoding/tests/test_output_1.txt ADDED
@@ -0,0 +1,36 @@
1
+ <|begin▁of▁sentence|>You are a helpful assistant.
2
+
3
+ ## Tools
4
+
5
+ You have access to a set of tools to help answer the user's question. You can invoke tools by writing a "<|DSML|tool_calls>" block like the following:
6
+
7
+ <|DSML|tool_calls>
8
+ <|DSML|invoke name="$TOOL_NAME">
9
+ <|DSML|parameter name="$PARAMETER_NAME" string="true|false">$PARAMETER_VALUE</|DSML|parameter>
10
+ ...
11
+ </|DSML|invoke>
12
+ <|DSML|invoke name="$TOOL_NAME2">
13
+ ...
14
+ </|DSML|invoke>
15
+ </|DSML|tool_calls>
16
+
17
+ String parameters should be specified as is and set `string="true"`. For all other types (numbers, booleans, arrays, objects), pass the value in JSON format and set `string="false"`.
18
+
19
+ If thinking_mode is enabled (triggered by <think>), you MUST output your complete reasoning inside <think>...</think> BEFORE any tool calls or final response.
20
+
21
+ Otherwise, output directly after </think> with tool calls or final response.
22
+
23
+ ### Available Tool Schemas
24
+
25
+ {"name": "get_weather", "description": "Get the weather for a specific location", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city name"}, "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "Temperature unit"}}, "required": ["location"]}}
26
+ {"name": "search", "description": "Search the web for information", "parameters": {"type": "object", "properties": {"query": {"type": "string", "description": "Search query"}, "num_results": {"type": "integer", "description": "Number of results to return"}}, "required": ["query"]}}
27
+
28
+ You MUST strictly follow the above defined tool name and parameter schemas to invoke tool calls.
29
+ <|User|>What's the weather in Beijing?<|Assistant|><think>The user wants to know the weather in Beijing. I should use the get_weather tool.</think>
30
+
31
+ <|DSML|tool_calls>
32
+ <|DSML|invoke name="get_weather">
33
+ <|DSML|parameter name="location" string="true">Beijing</|DSML|parameter>
34
+ <|DSML|parameter name="unit" string="true">celsius</|DSML|parameter>
35
+ </|DSML|invoke>
36
+ </|DSML|tool_calls><|end▁of▁sentence|><|User|><tool_result>{"temperature": 22, "condition": "sunny", "humidity": 45}</tool_result><|Assistant|><think>Got the weather data. Let me format a nice response.</think>The weather in Beijing is currently sunny with a temperature of 22°C and 45% humidity.<|end▁of▁sentence|>
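The expected output above shows the tool-call layout: a `tool_calls` block containing `invoke` elements, each with `parameter` children whose `string` attribute says whether the value is a raw string or JSON. A minimal sketch of reading such a block back into Python follows. It is not the parser shipped in `encoding/encoding_dsv4.py`; the function name `parse_tool_calls` and the regex approach are assumptions made for illustration, and the delimiter is taken from the `dsml_token` value in `jang_config.json`.

```python
import json
import re

# Delimiter as recorded in jang_config.json (chat.tool_calling.dsml_token).
DSML = "\uff5cDSML\uff5c"
_D = re.escape(DSML)

# Hypothetical patterns for illustration; the real encoder may differ.
INVOKE_RE = re.compile(
    rf'<{_D}invoke name="([^"]+)">(.*?)</{_D}invoke>', re.DOTALL
)
PARAM_RE = re.compile(
    rf'<{_D}parameter name="([^"]+)" string="(true|false)">(.*?)</{_D}parameter>',
    re.DOTALL,
)


def parse_tool_calls(text):
    """Return a list of {"name", "arguments"} dicts found in `text`."""
    calls = []
    for tool_name, body in INVOKE_RE.findall(text):
        arguments = {}
        for param_name, is_string, raw in PARAM_RE.findall(body):
            # string="true" keeps the value verbatim; otherwise it is JSON.
            arguments[param_name] = raw if is_string == "true" else json.loads(raw)
        calls.append({"name": tool_name, "arguments": arguments})
    return calls
```

Applied to the assistant turn in this fixture, a parser of this shape would yield one `get_weather` call with `location` and `unit` as plain strings.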
encoding/tests/test_output_2.txt ADDED
@@ -0,0 +1 @@
+ <|begin▁of▁sentence|>You are a helpful assistant.<|User|>Hello<|Assistant|></think>Hi there! How can I help you?<|end▁of▁sentence|><|User|>What is the capital of France?<|Assistant|><think>The user asks about the capital of France. It is Paris.</think>The capital of France is Paris.<|end▁of▁sentence|>
encoding/tests/test_output_3.txt ADDED
@@ -0,0 +1,38 @@
1
+ <|begin▁of▁sentence|>该助手为DeepSeek,由深度求索公司创造。<|latest_reminder|>2026-02-21,星期六,广州,App,中文<|User|>小柴胡冲剂和布洛芬能一起吃吗?
2
+
3
+ CITATION FORMAT: 【{cursor_id}†L{start_line_id}(-L{end_line_id})?】
4
+
5
+ ## Tools
6
+
7
+ You have access to a set of tools to help answer the user's question. You can invoke tools by writing a "<|DSML|tool_calls>" block like the following:
8
+
9
+ <|DSML|tool_calls>
10
+ <|DSML|invoke name="$TOOL_NAME">
11
+ <|DSML|parameter name="$PARAMETER_NAME" string="true|false">$PARAMETER_VALUE</|DSML|parameter>
12
+ ...
13
+ </|DSML|invoke>
14
+ <|DSML|invoke name="$TOOL_NAME2">
15
+ ...
16
+ </|DSML|invoke>
17
+ </|DSML|tool_calls>
18
+
19
+ String parameters should be specified as is and set `string="true"`. For all other types (numbers, booleans, arrays, objects), pass the value in JSON format and set `string="false"`.
20
+
21
+ If thinking_mode is enabled (triggered by <think>), you MUST output your complete reasoning inside <think>...</think> BEFORE any tool calls or final response.
22
+
23
+ Otherwise, output directly after </think> with tool calls or final response.
24
+
25
+ ### Available Tool Schemas
26
+
27
+ {"name": "search", "description": "Web search. Split multiple queries with '||'.", "parameters": {"type": "object", "properties": {"queries": {"type": "string", "description": "query1||query2"}}, "required": ["queries"], "additionalProperties": false, "$schema": "http://json-schema.org/draft-07/schema#"}}
28
+ {"name": "open", "description": "Batch open IDs (format 【{id}†...】) or URLs.", "parameters": {"type": "object", "properties": {"open_list": {"type": "array", "items": {"type": "object", "properties": {"id": {"description": "ID or URL", "anyOf": [{"type": "integer"}, {"type": "string"}], "default": -1}, "cursor": {"type": "integer", "description": "", "default": -1}, "loc": {"type": "integer", "description": "Start line", "default": -1}, "num_lines": {"type": "integer", "description": "", "default": -1}, "view_source": {"type": "boolean", "description": "", "default": false}}, "additionalProperties": false}, "description": ""}}, "required": ["open_list"], "additionalProperties": false, "$schema": "http://json-schema.org/draft-07/schema#"}}
29
+ {"name": "find", "description": "Find exact text pattern in pages.", "parameters": {"type": "object", "properties": {"find_list": {"type": "array", "items": {"type": "object", "properties": {"pattern": {"type": "string", "description": ""}, "cursor": {"type": "integer", "description": "", "default": -1}}, "required": ["pattern"], "additionalProperties": false}, "description": ""}}, "required": ["find_list"], "additionalProperties": false, "$schema": "http://json-schema.org/draft-07/schema#"}}
30
+
31
+ You MUST strictly follow the above defined tool name and parameter schemas to invoke tool calls.
32
+ <|Assistant|><think>用户想知道小柴胡冲剂和布洛芬能否一起服用。</think>
33
+
34
+ <|DSML|tool_calls>
35
+ <|DSML|invoke name="search">
36
+ <|DSML|parameter name="queries" string="true">小柴胡冲剂 布洛芬 相互作用 一起吃</|DSML|parameter>
37
+ </|DSML|invoke>
38
+ </|DSML|tool_calls><|end▁of▁sentence|><|User|><tool_result>[0]</tool_result><|Assistant|><think>现在开始组织回答。</think>请及时就医。<|end▁of▁sentence|>
encoding/tests/test_output_4.txt ADDED
@@ -0,0 +1,29 @@
1
+ <|begin▁of▁sentence|>该助手为DeepSeek-V3,由深度求索公司创造。
2
+ 今天是2025年10月17日,星期五。<|latest_reminder|>2024-11-15,上海市,App,中文<|User|>热海大滚锅是世界著名温泉吗<|Assistant|></think>关于热海大滚锅是否是世界著名温泉,可以这样总结:它在**中国乃至全球的地热奇观中占有重要地位**,但“世界著名”的称号更侧重于它作为**独特的地质现象和旅游景点**,而非普遍的温泉疗养体验。
3
+
4
+ 为了让你快速了解,我整理了一个简要的表格:
5
+
6
+ | 维度 | 热海大滚锅的地位与特点 |
7
+ | :--- | :--- |
8
+ | **地质奇观** | **世界罕见**的高温地热系统,泉眼多、水温高、形态多样。 |
9
+ | **国内声誉** | **中国三大地热区之一**,**国家5A级旅游景区**,自古闻名(徐霞客曾游历并记载)。 |
10
+ | **国际知名度** | 在一些旅游平台被提及为“世界六大温泉”之一,但此说法流传不广,其国际声誉更多建立在地质独特性上。 |
11
+ | **核心体验** | **观赏地热奇观**(如97℃沸腾的“大滚锅”)、**体验温泉煮鸡蛋**。 |
12
+
13
+ ### 💡 游玩攻略与温馨提示
14
+
15
+ 如果你计划前往热海大滚锅,这里有一些实用信息供你参考:
16
+
17
+ - **门票与开放时间**:
18
+ - **门票**:景区门票约为**50元/人**。如果选择包含温泉沐浴的套餐,价格会更高,例如约**288元**。
19
+ - **开放时间**:景区一般**08:00-18:00**开放,但具体时间可能变动,建议提前核实。
20
+
21
+ - **特色体验**:
22
+ - **温泉煮鸡蛋**:这几乎是必试项目。可以在景区门口购买用草绳串起的生鸡蛋(约5-8元/串),然后到“大滚锅”旁的指定区域蒸煮,几分钟便可熟食,趣味十足。
23
+ - **金汤足浴**:可以直接用从“大滚锅”流出的温泉水泡脚,缓解旅途疲劳。
24
+
25
+ - **注意事项**:
26
+ - **安全第一**:“大滚锅”水温极高,务必遵守游览规则,在指定区域内观赏,切勿随意触碰泉水。
27
+ - **规划行程**:建议为热海景区预留**3-4小时**的游览时间。景区内步道不走回头路,出入口有观光车接送。
28
+
29
+ 希望这些信息能帮助你更好地了解热海大滚锅。如果你对腾冲的其他景点或者行程规划有更多疑问,我很乐意提供进一步的信息。<|end▁of▁sentence|><|User|>世界著名温泉有哪些<|Assistant|></think><|action|>Search<|end▁of▁sentence|>
generation_config.json ADDED
@@ -0,0 +1,12 @@
+ {
+ "_from_model_config": true,
+ "bos_token_id": 0,
+ "eos_token_id": [
+ 1,
+ 128803
+ ],
+ "do_sample": true,
+ "temperature": 1.0,
+ "top_p": 1.0,
+ "transformers_version": "4.46.3"
+ }
jang_config.json ADDED
@@ -0,0 +1,101 @@
+ {
+ "weight_format": "mxtq",
+ "profile": "JANGTQ2",
+ "variant": "std",
+ "mxtq_seed": 42,
+ "drop_mtp": false,
+ "critical_f32_preserved": true,
+ "dsv4_runtime_requirements": {
+ "limited_swiglu_tq_patch": true,
+ "generic_mlx_sinks": false,
+ "native_cache_schema": "deepseek_v4_v7",
+ "generic_turboquant_kv": false
+ },
+ "quantization": {
+ "method": "affine+mxtq",
+ "routed_experts": {
+ "bits": 2,
+ "codec": "mxtq",
+ "bit_plan": {
+ "default_bits": 2,
+ "codec": "mxtq"
+ }
+ },
+ "non_routed": {
+ "bits": 8,
+ "codec": "affine",
+ "group_size": 32
+ },
+ "critical_control_tensors": "source-f32"
+ },
+ "source_model": "/Volumes/eric/models/Sources/DeepSeek-V4-Flash",
+ "source_config": {
+ "n_routed_experts": 256,
+ "num_hidden_layers": 43,
+ "n_hash_layers": 3
+ },
+ "routed_expert_bit_plan": {
+ "default_bits": 2,
+ "codec": "mxtq"
+ },
+ "mxtq_bits": {
+ "routed_expert": 2,
+ "attention": 8,
+ "shared_expert": 8,
+ "compressor": 8,
+ "indexer": 8,
+ "embed_tokens": 8,
+ "lm_head": 8,
+ "norms_router_hc": 16
+ },
+ "model_family": "deepseek_v4",
+ "chat": {
+ "encoder": "encoding_dsv4",
+ "encoder_fn": "encode_messages",
+ "chat_template_source": "builtin_encoding_module",
+ "has_tokenizer_chat_template": false,
+ "bos_token": "<\uff5cbegin\u2581of\u2581sentence\uff5c>",
+ "eos_token": "<\uff5cend\u2581of\u2581sentence\uff5c>",
+ "bos_token_id": 0,
+ "eos_token_id": 1,
+ "role_tokens": {
+ "user": "<\uff5cUser\uff5c>",
+ "assistant": "<\uff5cAssistant\uff5c>",
+ "latest_reminder": "<\uff5clatest_reminder\uff5c>"
+ },
+ "reasoning": {
+ "supported": true,
+ "modes": [
+ "chat",
+ "thinking"
+ ],
+ "default_mode": "chat",
+ "thinking_start": "<think>",
+ "thinking_end": "</think>",
+ "reasoning_effort_levels": [
+ "max",
+ "high",
+ null
+ ],
+ "drop_earlier_reasoning": true
+ },
+ "tool_calling": {
+ "supported": true,
+ "parser": "dsml",
+ "dsml_token": "\uff5cDSML\uff5c",
+ "tool_calls_block": "tool_calls",
+ "invoke_block": "invoke",
+ "parameter_block": "parameter",
+ "tool_output_tag": "tool_result"
+ },
+ "sampling_defaults": {
+ "temperature": 0.6,
+ "top_p": 0.95,
+ "repetition_penalty": 1.0,
+ "repetition_penalty_thinking": 1.0,
+ "repetition_penalty_chat": 1.05,
+ "max_new_tokens": 4096
+ }
+ },
+ "routed_expert_layout": "prestacked"
+ }
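The `chat.sampling_defaults` object in `jang_config.json` carries a generic `repetition_penalty` alongside per-mode variants (`repetition_penalty_chat`, `repetition_penalty_thinking`), and `chat.reasoning` lists the supported modes. The sketch below shows one way a loader might resolve these into a single set of sampling parameters. The helper name `sampling_for_mode` and the rule that a per-mode key overrides the generic one are assumptions for illustration, not documented behaviour of the JANGTQ runtime.

```python
import json


def sampling_for_mode(jang_config, mode=None):
    """Resolve sampling parameters for a reasoning mode (hypothetical helper).

    Assumes a key named `repetition_penalty_<mode>` takes precedence over the
    generic `repetition_penalty` when both are present.
    """
    chat = jang_config["chat"]
    reasoning = chat["reasoning"]
    mode = mode or reasoning["default_mode"]
    if mode not in reasoning["modes"]:
        raise ValueError(f"unsupported mode: {mode!r}")

    defaults = chat["sampling_defaults"]
    penalty = defaults.get(
        f"repetition_penalty_{mode}", defaults["repetition_penalty"]
    )
    return {
        "temperature": defaults["temperature"],
        "top_p": defaults["top_p"],
        "repetition_penalty": penalty,
        "max_new_tokens": defaults["max_new_tokens"],
    }


def load_sampling(path, mode=None):
    """Read a jang_config.json file from `path` and resolve its sampling settings."""
    with open(path, "r", encoding="utf-8") as handle:
        return sampling_for_mode(json.load(handle), mode)
```

With the values in this commit, the default `chat` mode would resolve to a repetition penalty of 1.05 and `thinking` to 1.0. Note that `generation_config.json` in the same commit lists different `temperature` and `top_p` values, so a loader has to decide which file takes precedence.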
jangtq_runtime.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4d9d6a16471eb320cb1182ff6eb3f00f0fd8ca47fe12a1f96a59b8058072f9ab
+ size 24904
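Each `.safetensors` entry in this commit is a Git LFS pointer: three `key value` lines giving the spec version, a `sha256` object ID, and the byte size of the real file. A minimal sketch of checking a downloaded file against its pointer is shown below. The function names `parse_lfs_pointer` and `verify_blob` are hypothetical and only the standard library is used.

```python
import hashlib
import os


def parse_lfs_pointer(text):
    """Parse the `key value` lines of a Git LFS pointer into a dict."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algorithm, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def verify_blob(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` matches the pointer's size and hash."""
    if os.path.getsize(path) != pointer["size"]:
        return False
    hasher = hashlib.new(pointer["algorithm"])
    with open(path, "rb") as handle:
        # Stream in chunks so multi-gigabyte shards do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

Checking the size first is a cheap way to reject truncated downloads before hashing a shard of roughly 1 GB.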
model-00001-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:088ed989062e97cb71ccd12e92b8d3036dd7412e778b767b9cb61079c56bc2e4
+ size 1075839457
model-00002-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a94150b65120ddde9f91557006a7cd39aade24ef14d385e62fbd59b8d3cfb7e9
+ size 1075839682
model-00003-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6ee2e073f0dd50373827026878836a4c651e496afd0fcdea2bedd1cf50433f4e
+ size 1076888258
model-00004-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:73dd3f9012b95d8f7f20963b39107f22b0c6a843c4b7bacb1e78918894df3f8c
+ size 1076888258
model-00005-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2e20638be8a0991503c4e0e6b451f7513ae6b9ed7b6b182486501e5c391c0e8d
+ size 1075839682
model-00006-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b1d6bfb75d70461dbe705becf480bfde94e17ac3b0249534df21332345fafda4
+ size 1076888266
model-00007-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d24390d9fc5810048f521917b338e1b836b4f0ea1dcae87298df13ccb367dbbd
+ size 1076888266
model-00008-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7ead78397d9ef3327660433bbf89d16f69937242efbfd9b48fee6bf17420c81f
+ size 1075839682
model-00009-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:df5285b4a5c4a30c2d4f9aa7a8cafe2021a8e1d87c22a2ae3efdafe3808e9560
+ size 1076888266
model-00010-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a1aeba15162e6cf3dd5b5218c29e5a05232b15cf032cecf575152544b8266fa8
+ size 1076888266
model-00011-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:034276980d07a512ffa9ba583fc7e4701a297bac92d5f7dda64c079497b2c566
+ size 1075839682
model-00012-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4eb3a2429b0266bf4182aaa31ed6d9f1b5e21fd389b30613ca806af83dae0d86
+ size 1076888266
model-00013-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5160c7c27a16295a7a5791332ef48adfaf7b7560c948e42d4f1f8ab3e25c2558
+ size 1076888266
model-00014-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:20213048afa9fe1cdbc22efcd7d2d7a6209364716661e16e0377478b54b84917
+ size 1075839682
model-00015-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ec554fba2ae5cb80e315c4aa48118dca8573b0283f24162d133d9e325685a91d
+ size 1076888266
model-00016-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7918bba6b49b2ac75fa6875dc09ba149c67776a11bff0cf6cc4da970d0979892
+ size 1076888266
model-00017-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:60309a58123006e2e2c127a1ad5ec519b8260e1837fa181cb678d50fc46032f7
+ size 1075839682
model-00018-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:630aed5bc4cc32e8f003faaec7037c03d6cc3e2fbd8d311bfcc9534dc5e1f68e
+ size 1076888266
model-00019-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fe2538af49fbefc5fb89d1acf976a7b92980633c47caa33c7fcb2759fb20331b
+ size 1076888258
model-00020-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:fc0e4ad24ed5556cf117585e3a33f91bd00b8541c67c7af660d7212d8f8b7da7
+ size 1075839682
model-00021-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:32b8cff0de0ce15ccc5dc567ffcf4ec89e29ae09fc6eff669fc68c9656a280da
+ size 1076888266
model-00022-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bba68aa12b96a7964b9fb13e27995f8f8133d175474fffe1ec48cb175d333a28
+ size 1076888266
model-00023-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2fe567e503d44731bb7fcbb4e9a0e4f1f27e52b2fcb12582220e00ecd1075594
+ size 1075839682
model-00024-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:02e9b3d56f69b37354c712293b27469613f2af31b70a902bdce1359104aa88c4
+ size 1076888266
model-00025-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:47063266b0e44b21f763fb37dbb7679baa15b3d3403e4d13dcbe74c0a6a2ba60
+ size 1076888266
model-00026-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e054e97984c9920b8b3c69cad1e79ff3af214d8d5567ef393ea1e68a61cc4bee
+ size 1075839682
model-00027-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3a88be58ebbdff6f50a41d5d7491652cbc00a6f21e9a3c189b0042b08711bbf1
+ size 1076888266
model-00028-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4fd0c8aa1620f5ba038d2d810d96f5e194b2efd133a03eeadabfcaa6c6a84b17
+ size 1076888266
model-00029-of-00075.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6948b397d12078aa8324652f4162a42c6ee94fb53eda25da013c6685a3a0c5b6
+ size 1075839682