clarkkitchen22 committed
Commit 4ed5bd5 · verified · 1 Parent(s): 42d252d

Upload NFLWRBOT25-1.7b merged model

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
33
  *.zip filter=lfs diff=lfs merge=lfs -text
34
  *.zst filter=lfs diff=lfs merge=lfs -text
35
  *tfevents* filter=lfs diff=lfs merge=lfs -text
36
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,143 @@
1
+ ---
2
+ license: apache-2.0
3
+ base_model:
4
+ - Qwen/Qwen3-1.7B
5
+ library_name: transformers
6
+ pipeline_tag: text-generation
7
+ tags:
8
+ - qwen3
9
+ - nfl
10
+ - wide-receiver
11
+ - sports-analytics
12
+ - football
13
+ - text-generation
14
+ - qlora
15
+ datasets:
16
+ - clarkkitchen22/clean
17
+ language:
18
+ - en
19
+ ---
20
+
21
+ # NFLWRBOT25-1.7b
22
+
23
+ NFLWRBOT25-1.7b is a Qwen3 1.7B causal language model fine-tuned to answer questions about 2025 NFL wide receiver statistics. It is intended for conversational lookup, explanation, comparison, and lightweight analysis of receiver production, usage, efficiency, quarter splits, and related context from the cleaned 2025 wide receiver dataset.
24
+
25
+ This checkpoint is a merged full model. It was trained from `Qwen/Qwen3-1.7B` with a QLoRA adapter and then merged back into the base model weights for easier local loading.
26
+
27
+ ## Model Details
28
+
29
+ - Base model: `Qwen/Qwen3-1.7B`
30
+ - Fine-tuning method: QLoRA
31
+ - Quantization during training: 4-bit NF4
32
+ - LoRA rank: 16
33
+ - LoRA alpha: 32
34
+ - Sequence length: 2048
35
+ - Epochs: 1
36
+ - Training examples: 9,350
37
+ - Validation examples: 813
38
+ - Source dataset: `SebastianAndreu/24679_NFL_WR_Dataset_2025`
39
+ - Cleaned ChatML dataset: `clarkkitchen22/clean`
40
+
41
+ ## Intended Use
42
+
43
+ This model is designed for:
44
+
45
+ - Answering 2025 NFL wide receiver stat questions.
46
+ - Explaining receiver metrics such as targets, receptions, receiving yards, air yards, yards after catch, touchdowns, EPA, WPA, catch rate, target share, and air-yard share.
47
+ - Comparing receiver usage and efficiency profiles.
48
+ - Summarizing single-game and player-level receiving production.
49
+ - Helping users reason about wide receiver performance using the provided dataset.
50
+
51
+ It is not intended for betting advice, official league reporting, injury reporting, live sports updates, or decisions that require verified real-time information.
52
+
53
+ ## Training Data
54
+
55
+ The training data was converted from the public Hugging Face dataset `SebastianAndreu/24679_NFL_WR_Dataset_2025` into ChatML instruction examples. The cleaned dataset contains 10,163 total examples with train and validation splits.
56
+
57
+ The examples cover:
58
+
59
+ - Single-game lookup
60
+ - Quarter splits
61
+ - Usage and efficiency
62
+ - Scouting-style notes
63
+ - Player efficiency summaries
64
+ - Leverage target discussion
65
+ - Player totals
66
+ - Player comparisons
67
+ - Leaderboards
68
+
69
+ ## Training Results
70
+
71
+ The final full training run completed 585 steps.
72
+
73
+ | Metric | Value |
74
+ |---|---:|
75
+ | Epoch | 1.0 |
76
+ | Train runtime | 4,436 seconds |
77
+ | Final training loss | 0.328 |
78
+ | Final eval loss | 0.15836799144744873 |
79
+ | Final eval mean token accuracy | 0.9433529111585347 |
80
+
81
+ These metrics measure performance on the generated validation split. They should not be treated as a complete benchmark of sports reasoning, factual accuracy, or general language ability.
82
+
83
+ ## Usage
84
+
85
+ ```python
86
+ import torch
87
+ from transformers import AutoModelForCausalLM, AutoTokenizer
88
+
89
+ model_id = "clarkkitchen22/NFLWRBOT25-1.7b"
90
+
91
+ tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
92
+ model = AutoModelForCausalLM.from_pretrained(
93
+     model_id,
94
+     torch_dtype=torch.bfloat16,
95
+     device_map="auto",
96
+     trust_remote_code=True,
97
+ )
98
+
99
+ messages = [
100
+     {
101
+         "role": "system",
102
+         "content": "You are an expert in 2025 NFL wide receiver stats. Answer concisely and cite the numbers you use.",
103
+     },
104
+     {
105
+         "role": "user",
106
+         "content": "What should I look at to evaluate a 2025 wide receiver besides receptions and yards?",
107
+     },
108
+ ]
109
+
110
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
111
+ inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
112
+
113
+ with torch.no_grad():
114
+     outputs = model.generate(
115
+         **inputs,
116
+         max_new_tokens=256,
117
+         do_sample=False,
118
+         pad_token_id=tokenizer.eos_token_id,
119
+     )
120
+
121
+ print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
122
+ ```
123
+
124
+ ## Limitations
125
+
126
+ - The model only knows what was represented in the training data and the base model pretraining.
127
+ - It may hallucinate numbers if asked for data outside the cleaned dataset.
128
+ - It should not be used as an official source for NFL statistics.
129
+ - It does not provide live sports updates.
130
+ - It may need retrieval or direct dataset access for exact audit-grade answers.
131
+ - The validation split comes from the same cleaned conversion process as the training split, so the reported metrics do not prove broad generalization.
132
+
133
+ ## Responsible Use
134
+
135
+ For serious sports analytics, use this model as a conversational layer over verified data rather than as the sole source of truth. When exact statistics matter, cross-check against the original dataset or an authoritative statistics provider.
136
+
137
+ ## Attribution
138
+
139
+ Base model: `Qwen/Qwen3-1.7B`.
140
+
141
+ Source dataset: `SebastianAndreu/24679_NFL_WR_Dataset_2025`.
142
+
143
+ Cleaned ChatML dataset: `clarkkitchen22/clean`.
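The figures in the README above can be cross-checked with a few lines of arithmetic. The sketch below is not part of the commit. It assumes one optimizer step per effective batch, no sequence packing, and that the last partial batch is kept; the effective batch size is not stated in the README, so the value derived here is an inference. The LoRA scaling uses the conventional `alpha / rank` definition, which is also an assumption about the training setup.

```python
import math

# Values copied from the README in this commit.
train_examples = 9_350
val_examples = 813
steps = 585
runtime_s = 4_436
eval_loss = 0.15836799144744873
lora_rank, lora_alpha = 16, 32

# The README reports 10,163 total examples; the two splits should add up to it.
total_examples = train_examples + val_examples

# Effective batch sizes b for which one epoch takes exactly `steps` steps,
# i.e. ceil(train_examples / b) == steps (assumes no packing, no dropped batch).
candidate_batch_sizes = [
    b for b in range(1, 257) if math.ceil(train_examples / b) == steps
]

seconds_per_step = runtime_s / steps
eval_perplexity = math.exp(eval_loss)  # perplexity implied by the eval loss
lora_scaling = lora_alpha / lora_rank  # conventional LoRA scaling factor

print("total examples:", total_examples)
print("batch sizes consistent with 585 steps:", candidate_batch_sizes)
print("seconds per step:", round(seconds_per_step, 2))
print("eval perplexity:", round(eval_perplexity, 3))
print("LoRA scaling:", lora_scaling)
```

Under those assumptions, 16 is the only integer effective batch size that yields 585 steps over 9,350 examples.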
chat_template.jinja ADDED
@@ -0,0 +1,89 @@
1
+ {%- if tools %}
2
+ {{- '<|im_start|>system\n' }}
3
+ {%- if messages[0].role == 'system' %}
4
+ {{- messages[0].content + '\n\n' }}
5
+ {%- endif %}
6
+ {{- "# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>" }}
7
+ {%- for tool in tools %}
8
+ {{- "\n" }}
9
+ {{- tool | tojson }}
10
+ {%- endfor %}
11
+ {{- "\n</tools>\n\nFor each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:\n<tool_call>\n{\"name\": <function-name>, \"arguments\": <args-json-object>}\n</tool_call><|im_end|>\n" }}
12
+ {%- else %}
13
+ {%- if messages[0].role == 'system' %}
14
+ {{- '<|im_start|>system\n' + messages[0].content + '<|im_end|>\n' }}
15
+ {%- endif %}
16
+ {%- endif %}
17
+ {%- set ns = namespace(multi_step_tool=true, last_query_index=messages|length - 1) %}
18
+ {%- for message in messages[::-1] %}
19
+ {%- set index = (messages|length - 1) - loop.index0 %}
20
+ {%- if ns.multi_step_tool and message.role == "user" and message.content is string and not(message.content.startswith('<tool_response>') and message.content.endswith('</tool_response>')) %}
21
+ {%- set ns.multi_step_tool = false %}
22
+ {%- set ns.last_query_index = index %}
23
+ {%- endif %}
24
+ {%- endfor %}
25
+ {%- for message in messages %}
26
+ {%- if message.content is string %}
27
+ {%- set content = message.content %}
28
+ {%- else %}
29
+ {%- set content = '' %}
30
+ {%- endif %}
31
+ {%- if (message.role == "user") or (message.role == "system" and not loop.first) %}
32
+ {{- '<|im_start|>' + message.role + '\n' + content + '<|im_end|>' + '\n' }}
33
+ {%- elif message.role == "assistant" %}
34
+ {%- set reasoning_content = '' %}
35
+ {%- if message.reasoning_content is string %}
36
+ {%- set reasoning_content = message.reasoning_content %}
37
+ {%- else %}
38
+ {%- if '</think>' in content %}
39
+ {%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
40
+ {%- set content = content.split('</think>')[-1].lstrip('\n') %}
41
+ {%- endif %}
42
+ {%- endif %}
43
+ {%- if loop.index0 > ns.last_query_index %}
44
+ {%- if loop.last or (not loop.last and reasoning_content) %}
45
+ {{- '<|im_start|>' + message.role + '\n<think>\n' + reasoning_content.strip('\n') + '\n</think>\n\n' + content.lstrip('\n') }}
46
+ {%- else %}
47
+ {{- '<|im_start|>' + message.role + '\n' + content }}
48
+ {%- endif %}
49
+ {%- else %}
50
+ {{- '<|im_start|>' + message.role + '\n' + content }}
51
+ {%- endif %}
52
+ {%- if message.tool_calls %}
53
+ {%- for tool_call in message.tool_calls %}
54
+ {%- if (loop.first and content) or (not loop.first) %}
55
+ {{- '\n' }}
56
+ {%- endif %}
57
+ {%- if tool_call.function %}
58
+ {%- set tool_call = tool_call.function %}
59
+ {%- endif %}
60
+ {{- '<tool_call>\n{"name": "' }}
61
+ {{- tool_call.name }}
62
+ {{- '", "arguments": ' }}
63
+ {%- if tool_call.arguments is string %}
64
+ {{- tool_call.arguments }}
65
+ {%- else %}
66
+ {{- tool_call.arguments | tojson }}
67
+ {%- endif %}
68
+ {{- '}\n</tool_call>' }}
69
+ {%- endfor %}
70
+ {%- endif %}
71
+ {{- '<|im_end|>\n' }}
72
+ {%- elif message.role == "tool" %}
73
+ {%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
74
+ {{- '<|im_start|>user' }}
75
+ {%- endif %}
76
+ {{- '\n<tool_response>\n' }}
77
+ {{- content }}
78
+ {{- '\n</tool_response>' }}
79
+ {%- if loop.last or (messages[loop.index0 + 1].role != "tool") %}
80
+ {{- '<|im_end|>\n' }}
81
+ {%- endif %}
82
+ {%- endif %}
83
+ {%- endfor %}
84
+ {%- if add_generation_prompt %}
85
+ {{- '<|im_start|>assistant\n' }}
86
+ {%- if enable_thinking is defined and enable_thinking is false %}
87
+ {{- '<think>\n\n</think>\n\n' }}
88
+ {%- endif %}
89
+ {%- endif %}
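For the common case in the README usage example (no tools, only `system` and `user` messages), the template above reduces to plain ChatML framing. The sketch below reproduces just that branch in Python to make the resulting prompt string visible. `render_simple_chat` is a hypothetical helper, not part of the repository or of `transformers`; it deliberately ignores the tool-calling and assistant/`<think>` branches.

```python
def render_simple_chat(messages, add_generation_prompt=True, enable_thinking=None):
    """Hypothetical helper: mirrors only the no-tools, system/user path of the template."""
    parts = []
    for message in messages:
        role, content = message["role"], message["content"]
        if role not in ("system", "user"):
            # Assistant and tool turns need the fuller logic in chat_template.jinja.
            raise ValueError(f"role {role!r} is not covered by this sketch")
        if not isinstance(content, str):
            raise TypeError("this sketch only handles string content")
        # Each turn is wrapped as <|im_start|>role\ncontent<|im_end|>\n
        parts.append(f"<|im_start|>{role}\n{content}<|im_end|>\n")
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
        # The template emits an empty think block only when enable_thinking is
        # explicitly passed as False.
        if enable_thinking is False:
            parts.append("<think>\n\n</think>\n\n")
    return "".join(parts)


prompt = render_simple_chat(
    [
        {"role": "system", "content": "You are an expert in 2025 NFL wide receiver stats."},
        {"role": "user", "content": "Which metrics matter besides yards?"},
    ],
    enable_thinking=False,
)
print(prompt)
```

In real use, `tokenizer.apply_chat_template` remains the source of truth; this sketch is only for reading the template.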
config.json ADDED
@@ -0,0 +1,63 @@
1
+ {
2
+ "architectures": [
3
+ "Qwen3ForCausalLM"
4
+ ],
5
+ "attention_bias": false,
6
+ "attention_dropout": 0.0,
7
+ "bos_token_id": 151643,
8
+ "dtype": "bfloat16",
9
+ "eos_token_id": 151645,
10
+ "head_dim": 128,
11
+ "hidden_act": "silu",
12
+ "hidden_size": 2048,
13
+ "initializer_range": 0.02,
14
+ "intermediate_size": 6144,
15
+ "layer_types": [
16
+ "full_attention",
17
+ "full_attention",
18
+ "full_attention",
19
+ "full_attention",
20
+ "full_attention",
21
+ "full_attention",
22
+ "full_attention",
23
+ "full_attention",
24
+ "full_attention",
25
+ "full_attention",
26
+ "full_attention",
27
+ "full_attention",
28
+ "full_attention",
29
+ "full_attention",
30
+ "full_attention",
31
+ "full_attention",
32
+ "full_attention",
33
+ "full_attention",
34
+ "full_attention",
35
+ "full_attention",
36
+ "full_attention",
37
+ "full_attention",
38
+ "full_attention",
39
+ "full_attention",
40
+ "full_attention",
41
+ "full_attention",
42
+ "full_attention",
43
+ "full_attention"
44
+ ],
45
+ "max_position_embeddings": 40960,
46
+ "max_window_layers": 28,
47
+ "model_type": "qwen3",
48
+ "num_attention_heads": 16,
49
+ "num_hidden_layers": 28,
50
+ "num_key_value_heads": 8,
51
+ "pad_token_id": null,
52
+ "rms_norm_eps": 1e-06,
53
+ "rope_parameters": {
54
+ "rope_theta": 1000000,
55
+ "rope_type": "default"
56
+ },
57
+ "sliding_window": null,
58
+ "tie_word_embeddings": true,
59
+ "transformers_version": "5.8.0",
60
+ "use_cache": true,
61
+ "use_sliding_window": false,
62
+ "vocab_size": 151936
63
+ }
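The dimensions in `config.json` are enough to recompute the parameter count and compare it with the `total_parameters` and `total_size` values reported in `model.safetensors.index.json` in this commit. The sketch below is a sanity check, not code from the repository. It assumes the usual Qwen3 layout: no attention biases (`attention_bias` is false), per-head `q_norm`/`k_norm` vectors of length `head_dim`, a final `model.norm.weight` of length `hidden_size`, and no separate output head because `tie_word_embeddings` is true.

```python
# Values copied from config.json in this commit.
cfg = {
    "vocab_size": 151936,
    "hidden_size": 2048,
    "intermediate_size": 6144,
    "num_hidden_layers": 28,
    "num_attention_heads": 16,
    "num_key_value_heads": 8,
    "head_dim": 128,
}


def count_parameters(cfg):
    """Count weights for a Qwen3-style decoder with tied embeddings (assumed layout)."""
    hidden = cfg["hidden_size"]
    q_out = cfg["num_attention_heads"] * cfg["head_dim"]
    kv_out = cfg["num_key_value_heads"] * cfg["head_dim"]

    attention = hidden * q_out + 2 * hidden * kv_out + q_out * hidden  # q, k, v, o
    mlp = 3 * hidden * cfg["intermediate_size"]  # gate, up, down
    norms = 2 * hidden + 2 * cfg["head_dim"]  # two layernorms + q_norm + k_norm
    per_layer = attention + mlp + norms

    embeddings = cfg["vocab_size"] * hidden  # shared with the output head
    final_norm = hidden
    return embeddings + cfg["num_hidden_layers"] * per_layer + final_norm


total_parameters = count_parameters(cfg)
bf16_bytes = total_parameters * 2  # bfloat16 stores 2 bytes per weight

print(total_parameters)  # 1720574976, matching the index metadata
print(bf16_bytes)        # 3441149952, matching total_size
```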
generation_config.json ADDED
@@ -0,0 +1,13 @@
1
+ {
2
+ "bos_token_id": 151643,
3
+ "do_sample": true,
4
+ "eos_token_id": [
5
+ 151645,
6
+ 151643
7
+ ],
8
+ "pad_token_id": 151643,
9
+ "temperature": 0.6,
10
+ "top_k": 20,
11
+ "top_p": 0.95,
12
+ "transformers_version": "5.8.0"
13
+ }
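These defaults (`temperature` 0.6, `top_k` 20, `top_p` 0.95) apply only when sampling is enabled; the README usage example passes `do_sample=False`, so that call decodes greedily. To illustrate what the three settings do to a next-token distribution, here is a simplified pure-Python sketch. It is not the `transformers` implementation, and `filtered_distribution` and the toy logits are hypothetical.

```python
import math
import random


def filtered_distribution(logits, temperature=0.6, top_k=20, top_p=0.95):
    """Simplified sketch: temperature scaling, then top-k, then top-p filtering."""
    # 1) Temperature: divide logits before the softmax (max subtracted for stability).
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]

    # 2) Top-k: keep only the k most likely token indices.
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:top_k]
    mass = sum(weights[i] for i in ranked)

    # 3) Top-p: over the renormalized top-k set, keep the shortest prefix whose
    #    cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += weights[i] / mass
        if cumulative >= top_p:
            break

    kept_mass = sum(weights[i] for i in kept)
    return {i: weights[i] / kept_mass for i in kept}


toy_logits = [2.0, 1.0, 0.0, -1.0]  # hypothetical 4-token vocabulary
dist = filtered_distribution(toy_logits)

rng = random.Random(0)  # seeded so the sketch is repeatable
token = rng.choices(list(dist), weights=list(dist.values()), k=1)[0]
print(dist, token)
```

With these toy logits, the two most likely tokens already cover more than 95% of the probability mass, so the other two are filtered out before sampling.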
model-00001-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:791e79f9f9d01651228479e312f8a49a64cfd5a5c58b0ee8106170fb888ba708
3
+ size 1981418432
model-00002-of-00002.safetensors ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2a98b177fa0409cc74ae61ebb30213bd336a35f33a2d05fea34b2c057bfbe444
3
+ size 1459766976
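The two LFS pointer sizes can be reconciled with the `total_size` reported in `model.safetensors.index.json` in this commit. The sketch below only does the subtraction; attributing the small remainder to the per-file safetensors headers (an 8-byte length prefix plus JSON metadata) is an assumption, since the shard contents are not visible in the diff.

```python
# Sizes copied from the two LFS pointer files in this commit.
shard_sizes = {
    "model-00001-of-00002.safetensors": 1_981_418_432,
    "model-00002-of-00002.safetensors": 1_459_766_976,
}
# total_size from model.safetensors.index.json (tensor bytes only).
index_total_size = 3_441_149_952

on_disk_bytes = sum(shard_sizes.values())
overhead_bytes = on_disk_bytes - index_total_size  # presumably header data

print(on_disk_bytes)   # 3441185408
print(overhead_bytes)  # 35456
```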
model.safetensors.index.json ADDED
@@ -0,0 +1,318 @@
1
+ {
2
+ "metadata": {
3
+ "total_parameters": 1720574976,
4
+ "total_size": 3441149952
5
+ },
6
+ "weight_map": {
7
+ "model.embed_tokens.weight": "model-00001-of-00002.safetensors",
8
+ "model.layers.0.input_layernorm.weight": "model-00001-of-00002.safetensors",
9
+ "model.layers.0.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
10
+ "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
11
+ "model.layers.0.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
12
+ "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
13
+ "model.layers.0.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
14
+ "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
15
+ "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
16
+ "model.layers.0.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
17
+ "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
18
+ "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
19
+ "model.layers.1.input_layernorm.weight": "model-00001-of-00002.safetensors",
20
+ "model.layers.1.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
21
+ "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
22
+ "model.layers.1.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
23
+ "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
24
+ "model.layers.1.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
25
+ "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
26
+ "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
27
+ "model.layers.1.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
28
+ "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
29
+ "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
30
+ "model.layers.10.input_layernorm.weight": "model-00001-of-00002.safetensors",
31
+ "model.layers.10.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
32
+ "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
33
+ "model.layers.10.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
34
+ "model.layers.10.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
35
+ "model.layers.10.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
36
+ "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
37
+ "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
38
+ "model.layers.10.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
39
+ "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
40
+ "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
41
+ "model.layers.11.input_layernorm.weight": "model-00001-of-00002.safetensors",
42
+ "model.layers.11.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
43
+ "model.layers.11.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
44
+ "model.layers.11.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
45
+ "model.layers.11.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
46
+ "model.layers.11.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
47
+ "model.layers.11.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
48
+ "model.layers.11.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
49
+ "model.layers.11.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
50
+ "model.layers.11.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
51
+ "model.layers.11.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
52
+ "model.layers.12.input_layernorm.weight": "model-00001-of-00002.safetensors",
53
+ "model.layers.12.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
54
+ "model.layers.12.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
55
+ "model.layers.12.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
56
+ "model.layers.12.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
57
+ "model.layers.12.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
58
+ "model.layers.12.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
59
+ "model.layers.12.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
60
+ "model.layers.12.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
61
+ "model.layers.12.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
62
+ "model.layers.12.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
63
+ "model.layers.13.input_layernorm.weight": "model-00001-of-00002.safetensors",
64
+ "model.layers.13.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
65
+ "model.layers.13.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
66
+ "model.layers.13.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
67
+ "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
68
+ "model.layers.13.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
69
+ "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
70
+ "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
71
+ "model.layers.13.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
72
+ "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
73
+ "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
74
+ "model.layers.14.input_layernorm.weight": "model-00002-of-00002.safetensors",
75
+ "model.layers.14.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
76
+ "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
77
+ "model.layers.14.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
78
+ "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
79
+ "model.layers.14.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
80
+ "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
81
+ "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
82
+ "model.layers.14.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
83
+ "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
84
+ "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
85
+ "model.layers.15.input_layernorm.weight": "model-00002-of-00002.safetensors",
86
+ "model.layers.15.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
87
+ "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
88
+ "model.layers.15.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
89
+ "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
90
+ "model.layers.15.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
91
+ "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
92
+ "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
93
+ "model.layers.15.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
94
+ "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
95
+ "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
96
+ "model.layers.16.input_layernorm.weight": "model-00002-of-00002.safetensors",
97
+ "model.layers.16.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
98
+ "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
99
+ "model.layers.16.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
100
+ "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
101
+ "model.layers.16.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
102
+ "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
103
+ "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
104
+ "model.layers.16.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
105
+ "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
106
+ "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
107
+ "model.layers.17.input_layernorm.weight": "model-00002-of-00002.safetensors",
108
+ "model.layers.17.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
109
+ "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
110
+ "model.layers.17.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
111
+ "model.layers.17.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
112
+ "model.layers.17.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
113
+ "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
114
+ "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
115
+ "model.layers.17.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
116
+ "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
117
+ "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
118
+ "model.layers.18.input_layernorm.weight": "model-00002-of-00002.safetensors",
119
+ "model.layers.18.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
120
+ "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
121
+ "model.layers.18.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
122
+ "model.layers.18.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
123
+ "model.layers.18.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
124
+ "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
125
+ "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
126
+ "model.layers.18.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
127
+ "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
128
+ "model.layers.18.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
129
+ "model.layers.19.input_layernorm.weight": "model-00002-of-00002.safetensors",
130
+ "model.layers.19.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
131
+ "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
132
+ "model.layers.19.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
133
+ "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
134
+ "model.layers.19.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
135
+ "model.layers.19.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
136
+ "model.layers.19.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
137
+ "model.layers.19.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
138
+ "model.layers.19.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
139
+ "model.layers.19.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
140
+ "model.layers.2.input_layernorm.weight": "model-00001-of-00002.safetensors",
141
+ "model.layers.2.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
142
+ "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
143
+ "model.layers.2.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
144
+ "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
145
+ "model.layers.2.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
146
+ "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
147
+ "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
148
+ "model.layers.2.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
149
+ "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
150
+ "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
151
+ "model.layers.20.input_layernorm.weight": "model-00002-of-00002.safetensors",
152
+ "model.layers.20.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
153
+ "model.layers.20.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
154
+ "model.layers.20.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
155
+ "model.layers.20.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
156
+ "model.layers.20.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
157
+ "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
158
+ "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
159
+ "model.layers.20.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
160
+ "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
161
+ "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
162
+ "model.layers.21.input_layernorm.weight": "model-00002-of-00002.safetensors",
163
+ "model.layers.21.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
164
+ "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
165
+ "model.layers.21.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
166
+ "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
167
+ "model.layers.21.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
168
+ "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
169
+ "model.layers.21.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
170
+ "model.layers.21.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
171
+ "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
172
+ "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
173
+ "model.layers.22.input_layernorm.weight": "model-00002-of-00002.safetensors",
174
+ "model.layers.22.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
175
+ "model.layers.22.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
176
+ "model.layers.22.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
177
+ "model.layers.22.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
178
+ "model.layers.22.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
179
+ "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
180
+ "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
181
+ "model.layers.22.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
182
+ "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
183
+ "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
184
+ "model.layers.23.input_layernorm.weight": "model-00002-of-00002.safetensors",
185
+ "model.layers.23.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
186
+ "model.layers.23.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
187
+ "model.layers.23.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
188
+ "model.layers.23.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
189
+ "model.layers.23.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
190
+ "model.layers.23.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
191
+ "model.layers.23.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
192
+ "model.layers.23.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
193
+ "model.layers.23.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
194
+ "model.layers.23.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
195
+ "model.layers.24.input_layernorm.weight": "model-00002-of-00002.safetensors",
196
+ "model.layers.24.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
197
+ "model.layers.24.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
198
+ "model.layers.24.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
199
+ "model.layers.24.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
200
+ "model.layers.24.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
201
+ "model.layers.24.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
202
+ "model.layers.24.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
203
+ "model.layers.24.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
204
+ "model.layers.24.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
205
+ "model.layers.24.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
206
+ "model.layers.25.input_layernorm.weight": "model-00002-of-00002.safetensors",
207
+ "model.layers.25.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
208
+ "model.layers.25.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
209
+ "model.layers.25.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
210
+ "model.layers.25.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
211
+ "model.layers.25.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
212
+ "model.layers.25.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
213
+ "model.layers.25.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
214
+ "model.layers.25.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
215
+ "model.layers.25.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
216
+ "model.layers.25.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
217
+ "model.layers.26.input_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.26.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.26.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.26.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.26.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.26.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.26.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.26.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.26.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.26.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.26.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.27.input_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.27.mlp.down_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.27.mlp.gate_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.27.mlp.up_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.27.post_attention_layernorm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.27.self_attn.k_norm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.27.self_attn.k_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.27.self_attn.o_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.27.self_attn.q_norm.weight": "model-00002-of-00002.safetensors",
+ "model.layers.27.self_attn.q_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.27.self_attn.v_proj.weight": "model-00002-of-00002.safetensors",
+ "model.layers.3.input_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.input_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.5.input_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.5.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.5.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.5.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.5.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.input_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.input_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.input_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.input_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.mlp.down_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.mlp.up_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.self_attn.k_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.self_attn.q_norm.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00002.safetensors",
+ "model.layers.9.self_attn.v_proj.weight": "model-00001-of-00002.safetensors",
+ "model.norm.weight": "model-00002-of-00002.safetensors"
+ }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:be75606093db2094d7cd20f3c2f385c212750648bd6ea4fb2bf507a6a4c55506
+ size 11422650
tokenizer_config.json ADDED
@@ -0,0 +1,30 @@
+ {
+ "add_prefix_space": false,
+ "backend": "tokenizers",
+ "bos_token": null,
+ "clean_up_tokenization_spaces": false,
+ "eos_token": "<|im_end|>",
+ "errors": "replace",
+ "extra_special_tokens": [
+ "<|im_start|>",
+ "<|im_end|>",
+ "<|object_ref_start|>",
+ "<|object_ref_end|>",
+ "<|box_start|>",
+ "<|box_end|>",
+ "<|quad_start|>",
+ "<|quad_end|>",
+ "<|vision_start|>",
+ "<|vision_end|>",
+ "<|vision_pad|>",
+ "<|image_pad|>",
+ "<|video_pad|>"
+ ],
+ "is_local": false,
+ "local_files_only": false,
+ "model_max_length": 131072,
+ "pad_token": "<|endoftext|>",
+ "split_special_tokens": false,
+ "tokenizer_class": "Qwen2Tokenizer",
+ "unk_token": null
+ }