wygbb committed
Commit 74755b8 · verified · 1 Parent(s): b0dd5e2

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -47,3 +47,4 @@ images/st2.png filter=lfs diff=lfs merge=lfs -text
  images/st3.png filter=lfs diff=lfs merge=lfs -text
  images/st4.png filter=lfs diff=lfs merge=lfs -text
  images/version.PNG filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,16 +1,15 @@
  ---
- license: apache-2.0
+ base_model: SecGPT/SecGPT-1.5B
  language:
- - zh
- - en
- base_model:
- - Qwen/Qwen2.5-1.5B-Instruct
- library_name: transformers
+ - zh
+ license: apache-2.0
+ pipeline_tag: text-generation
  tags:
- - cybersecurity
- - security
- - network-security
+ - security
+ - chat
+ quantized_by: clouditera
  ---
+
  # 🌐 SecGPT: The World's First Open-Source Large Model for Cybersecurity

  ## 🔍 Model Overview
@@ -41,8 +40,10 @@ SecGPT integrates natural language understanding, code generation, security knowledge reasoning, and other core

  ## 📂 Open-Source Resources

- - **Model source code and documentation:**
- - https://github.com/Clouditera/secgpt
+ - ##### Model source code and documentation:
+
+ - https://github.com/Clouditera/secgpt
+
  - **Dataset download:**
  - https://huggingface.co/datasets/clouditera/security-paper-datasets

@@ -136,12 +137,12 @@ curl http://localhost:8000/v1/chat/completions \

  #### 1.1 Longitudinal Evaluation Across Model Versions

- | **Model version** | **CISSP** | **CS-EVAL** | **CEVAL** | **GSM8K** | **BBH** |
- | --------------- | ------------ | ------------- | ------------ | ------------ | ------------ |
- | **SecGPT-mini** | 25.67 | 39.64 | 37.50 | 3.87 | 21.80 |
- | **SecGPT-1.5B** | 71.09🔺+45.42 | 81.53 🔺+41.89 | 53.5 🔺+16.00 | 57.47🔺+53.60 | 45.17🔺+23.37 |
- | **SecGPT-7B** | 78.23🔺+52.97 | 85.12 🔺+45.48 | 72.89🔺+35.39 | 76.88🔺+73.01 | 67.08🔺+45.28 |
- | **SecGPT-14B** | 77.37🔺+51.70 | 86.12 🔺+46.48 | 59.45🔺+29.95 | 88.25🔺+84.38 | 75.90🔺+54.10 |
+ | **Model version** | **CISSP** | **CS-EVAL** | **CEVAL** | **GSM8K** | **BBH** |
+ | --------------- | ------------ | ------------- | ------------- | ------------ | ------------ |
+ | **SecGPT-mini** | 25.67 | 39.64 | 37.50 | 3.87 | 21.80 |
+ | **SecGPT-1.5B** | 72.61🔺+46.94 | 84.32🔺+44.68 | 54.02 🔺+16.52 | 55.95🔺+52.08 | 34.90🔺+13.10 |
+ | **SecGPT-7B** | 77.86🔺+52.19 | 88.24 🔺+48.60 | 70.40🔺+32.90 | 82.94🔺+79.07 | 61.51🔺+39.71 |
+ | **SecGPT-14B** | 78.84🔺+53.17 | 88.60 🔺+45.39 | 58.47🔺+20.97 | 81.80🔺+77.93 | 76.70🔺+54.90 |

  📈 **Interpreting the capability gains:**

@@ -156,11 +157,11 @@ curl http://localhost:8000/v1/chat/completions \
  | Model version | **CISSP** ↑ | **CS-EVAL ↑** | **CEVAL ↑** | **GSM8K ↑** | **BBH ↑** |
  | ---------------- | ------------ | -------------- | ----------- | ----------- | --------- |
  | **Qwen2.5-1.5B** | 52.97 | 71.66 | 59.91 | 61.03 | 43.44 |
- | **SecGPT-1.5B** | 71.09 | 81.53 | 53.5 | 57.47 | 45.17 |
+ | **SecGPT-1.5B** | 72.61 | 84.32 | 54.02 | 55.95 | 34.90 |
  | **Qwen2.5-7B** | 66.30 | 84.66 | 74.97 | 80.36 | 71.20 |
- | **SecGPT-7B** | 78.23 | 85.12 | 72.89 | 76.88 | 67.08 |
+ | **SecGPT-7B** | 77.86 | 88.24 | 70.40 | 82.94 | 61.51 |
  | **Qwen2.5-14B** | 71.09 | 86.22 | 68.57 | 90.03 | 78.25 |
- | **SecGPT-14B** | 77.37 | 86.12 | 59.45 | 88.25 | 75.90 |
+ | **SecGPT-14B** | 78.84 | 88.60 | 58.47 | 81.80 | 76.70 |

  💡 **Key insights:**

@@ -335,4 +336,7 @@ SecGPT is an open-source large-model project for the cybersecurity domain; we believe
  - This project was built for research and exchange purposes; its outputs may be limited by the coverage of the model's training data;
  - Users should judge the correctness and applicability of the model's outputs for themselves;
  - If you plan to use this model for **public release or commercial deployment**, be sure to explicitly assume the relevant legal and compliance responsibilities;
- - The developers of this project accept no liability for any direct or indirect damage that may arise from use of this model (including but not limited to the model itself, its training data, and its outputs).
+ - The developers of this project accept no liability for any direct or indirect damage that may arise from use of this model (including but not limited to the model itself, its training data, and its outputs).
+
+
+
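The 🔺 values in the section 1.1 table read as each score minus the SecGPT-mini score in the same column. The sketch below recomputes them from the added rows of the README.md hunk, as a quick consistency check. The names `BASELINE`, `SCORES`, `PRINTED`, `deltas`, and `mismatches` are illustrative and not part of the repository. Under that reading, every printed delta matches except SecGPT-14B on CS-EVAL, where 88.60 - 39.64 = 48.96 while the row shows +45.39.

```python
# Scores and printed deltas copied from the added (+) rows of the 1.1 table.
BENCHMARKS = ["CISSP", "CS-EVAL", "CEVAL", "GSM8K", "BBH"]
BASELINE = [25.67, 39.64, 37.50, 3.87, 21.80]  # SecGPT-mini row

SCORES = {
    "SecGPT-1.5B": [72.61, 84.32, 54.02, 55.95, 34.90],
    "SecGPT-7B": [77.86, 88.24, 70.40, 82.94, 61.51],
    "SecGPT-14B": [78.84, 88.60, 58.47, 81.80, 76.70],
}
PRINTED = {
    "SecGPT-1.5B": [46.94, 44.68, 16.52, 52.08, 13.10],
    "SecGPT-7B": [52.19, 48.60, 32.90, 79.07, 39.71],
    "SecGPT-14B": [53.17, 45.39, 20.97, 77.93, 54.90],
}


def deltas(scores, baseline=BASELINE):
    """Return score minus baseline for each benchmark, rounded to 2 places."""
    return [round(s - b, 2) for s, b in zip(scores, baseline)]


def mismatches(tolerance=0.005):
    """List (model, benchmark, printed, recomputed) where the table disagrees."""
    found = []
    for model, scores in SCORES.items():
        for name, got, shown in zip(BENCHMARKS, deltas(scores), PRINTED[model]):
            if abs(got - shown) > tolerance:
                found.append((model, name, shown, got))
    return found


for row in mismatches():
    print(row)
```

Whether the score or the delta in that cell is the intended value cannot be determined from the diff alone.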
added_tokens.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "</tool_call>": 151658,
+   "<tool_call>": 151657,
+   "<|box_end|>": 151649,
+   "<|box_start|>": 151648,
+   "<|endoftext|>": 151643,
+   "<|file_sep|>": 151664,
+   "<|fim_middle|>": 151660,
+   "<|fim_pad|>": 151662,
+   "<|fim_prefix|>": 151659,
+   "<|fim_suffix|>": 151661,
+   "<|im_end|>": 151645,
+   "<|im_start|>": 151644,
+   "<|image_pad|>": 151655,
+   "<|object_ref_end|>": 151647,
+   "<|object_ref_start|>": 151646,
+   "<|quad_end|>": 151651,
+   "<|quad_start|>": 151650,
+   "<|repo_name|>": 151663,
+   "<|video_pad|>": 151656,
+   "<|vision_end|>": 151653,
+   "<|vision_pad|>": 151654,
+   "<|vision_start|>": 151652
+ }
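The file lists tokens alphabetically, which hides the ID order. The sketch below re-sorts the same mapping by ID and checks that the IDs form one unbroken range. The names `ADDED_TOKENS`, `by_id`, and `id_gaps` are illustrative; the data is copied from the hunk above.

```python
# Mapping copied from added_tokens.json as shown in this commit.
ADDED_TOKENS = {
    "</tool_call>": 151658,
    "<tool_call>": 151657,
    "<|box_end|>": 151649,
    "<|box_start|>": 151648,
    "<|endoftext|>": 151643,
    "<|file_sep|>": 151664,
    "<|fim_middle|>": 151660,
    "<|fim_pad|>": 151662,
    "<|fim_prefix|>": 151659,
    "<|fim_suffix|>": 151661,
    "<|im_end|>": 151645,
    "<|im_start|>": 151644,
    "<|image_pad|>": 151655,
    "<|object_ref_end|>": 151647,
    "<|object_ref_start|>": 151646,
    "<|quad_end|>": 151651,
    "<|quad_start|>": 151650,
    "<|repo_name|>": 151663,
    "<|video_pad|>": 151656,
    "<|vision_end|>": 151653,
    "<|vision_pad|>": 151654,
    "<|vision_start|>": 151652,
}


def by_id(tokens):
    """Return (token, id) pairs sorted by ascending ID."""
    return sorted(tokens.items(), key=lambda pair: pair[1])


def id_gaps(tokens):
    """Return IDs missing between the smallest and largest ID in the mapping."""
    ids = set(tokens.values())
    return [i for i in range(min(ids), max(ids) + 1) if i not in ids]


for token, token_id in by_id(ADDED_TOKENS):
    print(token_id, token)
print("gaps:", id_gaps(ADDED_TOKENS))
```

For the 22 entries in this commit the IDs run from 151643 to 151664 with no gaps.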
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:1d324616ff37a8c2ca0b3cd2d1cc403b6391db02bb7d5491d0fd1b12051dd13b
+ oid sha256:65bd04f7594f6e3345e7d1693e54cb0ed72c13bcb8a7a66ce46fb9a5db6751df
  size 3087466808
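Only the `oid` changes here: the weights were replaced by a different file of exactly the same byte size. A minimal sketch of checking a downloaded file against such a Git LFS pointer follows. `parse_lfs_pointer` and `matches_pointer` are hypothetical helpers, not part of Git LFS or huggingface_hub, and the pointer is assumed to contain only the three `key value` lines shown in the hunk.

```python
import hashlib

# New pointer contents copied from the hunk above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:65bd04f7594f6e3345e7d1693e54cb0ed72c13bcb8a7a66ce46fb9a5db6751df
size 3087466808
"""


def parse_lfs_pointer(text):
    """Split a pointer file into version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Stream the file at `path` and compare its size and hash to the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]


pointer = parse_lfs_pointer(NEW_POINTER)
print(pointer["algo"], pointer["size"])
```

With a local copy of the weights, `matches_pointer("model.safetensors", pointer)` would confirm that the download corresponds to this commit rather than its parent; the path is an assumption about where the file was saved.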
special_tokens_map.json ADDED
@@ -0,0 +1,31 @@
+ {
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>",
+     "<|object_ref_start|>",
+     "<|object_ref_end|>",
+     "<|box_start|>",
+     "<|box_end|>",
+     "<|quad_start|>",
+     "<|quad_end|>",
+     "<|vision_start|>",
+     "<|vision_end|>",
+     "<|vision_pad|>",
+     "<|image_pad|>",
+     "<|video_pad|>"
+   ],
+   "eos_token": {
+     "content": "<|im_end|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<|endoftext|>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
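This file stores `eos_token` and `pad_token` as objects with a `content` field, while the tokenizer_config.json hunk in the same commit stores them as plain strings. The sketch below normalizes both shapes so the two files can be compared. `token_text` and `same_special_tokens` are illustrative helpers, and the JSON excerpts are trimmed copies of what the diff shows.

```python
import json

# Excerpt of special_tokens_map.json (object form), copied from this commit.
SPECIAL_TOKENS_MAP = json.loads("""
{
  "eos_token": {"content": "<|im_end|>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false},
  "pad_token": {"content": "<|endoftext|>", "lstrip": false, "normalized": false,
                "rstrip": false, "single_word": false}
}
""")

# Excerpt of tokenizer_config.json (plain-string form), copied from this commit.
TOKENIZER_CONFIG = json.loads("""
{"eos_token": "<|im_end|>", "pad_token": "<|endoftext|>"}
""")


def token_text(entry):
    """Return the token string whether the entry is an object or a bare string."""
    if isinstance(entry, dict):
        return entry["content"]
    return entry


def same_special_tokens(left, right, keys=("eos_token", "pad_token")):
    """Check that both configs name the same token for each key."""
    return all(token_text(left[k]) == token_text(right[k]) for k in keys)


print(same_special_tokens(SPECIAL_TOKENS_MAP, TOKENIZER_CONFIG))
```

In this commit the two files agree: `<|im_end|>` ends a sequence and `<|endoftext|>` is used for padding.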
tokenizer.json CHANGED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json CHANGED
@@ -199,9 +199,10 @@
    "clean_up_tokenization_spaces": false,
    "eos_token": "<|im_end|>",
    "errors": "replace",
+   "extra_special_tokens": {},
    "model_max_length": 131072,
    "pad_token": "<|endoftext|>",
    "split_special_tokens": false,
    "tokenizer_class": "Qwen2Tokenizer",
    "unk_token": null
- }
+ }
vocab.json ADDED
The diff for this file is too large to render. See raw diff