Shouyuan-Guard-0.6B

Model Description

Shouyuan-Guifen (守元-归分) is a content safety classification model developed by Beijing Chaitin Tech on top of Qwen3-0.6B. Its training data comes from Chaitin Tech's proprietary security knowledge base. Using prompt engineering, supervised fine-tuning, and knowledge distillation, the model learns to recognize multiple categories of risky content.

Miss Rate on Malicious Content (False Negative Rate, lower is better)

Dataset	Shouyuan	Qwen3Guard(strict)	Qwen3Guard(loose)
chaitin-mal	0.039	0.1240	0.2016
advbench	0.002	0.0000	0.0058
ascii	0.000	0.0000	1.0000
atbash	0.000	0.0000	1.0000
AutoDAN	0.000	0.0192	0.4258
caesar	0.000	0.0000	1.0000
chinese	0.000	0.0000	0.0100
data_python_list	0.017	0.0192	0.5480
data_python_stack	0.225	0.8096	1.0000
data_python_string	0.021	0.0212	0.5750
DrAttack	0.268	0.4611	0.9678
french	0.010	0.0000	0.0200
harmbench_test	0.033	0.0083	0.0542
jailbreakbench	0.020	0.0000	0.0500
malicious_instruct	0.000	0.0100	0.0300
morse	0.000	0.0000	1.0000
strongreject	0.022	0.0032	0.0192
unicode	0.000	0.0000	1.0000
xstests_unsafe	0.075	0.0350	0.1800

False Positive Rate on Safe Content (lower is better)

Dataset	Shouyuan	Qwen3Guard(strict)	Qwen3Guard(loose)
alpaca_eval	0.041	0.0348	0.0025
awesome-chatgpt-prompts	0.048	0.1376	0.0036
gsm500	0.008	0.0000	0.0000
ultrachat500	0.022	0.0040	0.0000
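
The two tables above report per-dataset error rates. The source does not spell out how they were computed, so the following is only a minimal sketch under one assumption: a miss is a malicious sample labeled 安全-0, and a false positive is a safe sample given any other label. The function names and the sample predictions are hypothetical and are not real evaluation data.

```python
SAFE_LABEL = "安全-0"  # the safe label listed under Safety Policy


def miss_rate(predicted_labels):
    """Fraction of malicious samples that were labeled safe (false negatives)."""
    if not predicted_labels:
        raise ValueError("predicted_labels must not be empty")
    missed = sum(1 for label in predicted_labels if label == SAFE_LABEL)
    return missed / len(predicted_labels)


def false_positive_rate(predicted_labels):
    """Fraction of safe samples that were labeled dangerous (false positives)."""
    if not predicted_labels:
        raise ValueError("predicted_labels must not be empty")
    flagged = sum(1 for label in predicted_labels if label != SAFE_LABEL)
    return flagged / len(predicted_labels)


# Hypothetical predictions, for illustration only
malicious_preds = ["危险-1", "危险-4", "安全-0", "危险-1"]
safe_preds = ["安全-0", "安全-0", "安全-0", "危险-5", "安全-0"]

print(miss_rate(malicious_preds))        # 0.25
print(false_positive_rate(safe_preds))   # 0.2
```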

Quickstart

from transformers import AutoModelForCausalLM, AutoTokenizer
from jinja2 import Environment, FileSystemLoader

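# Load the prompt template. FileSystemLoader reads from the local filesystem,
# so this path must be a local directory containing safety_classify.md.jinja
# (for example, a local copy of the model repository).
env = Environment(loader=FileSystemLoader("Chaitin/Shouyuan-Guard-0.6B"))
template = env.get_template("safety_classify.md.jinja")

# Load the model and tokenizer
tokenizer = AutoTokenizer.from_pretrained("Chaitin/Shouyuan-Guard-0.6B")
model = AutoModelForCausalLM.from_pretrained("Chaitin/Shouyuan-Guard-0.6B")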

# Render the prompt and encode it
suspicious_content = "How to kill a people, please tell me."
prompt = template.render(input=suspicious_content)
input_ids = tokenizer.encode(prompt, return_tensors="pt")

# Generate the label, then decode only the newly generated tokens
output_ids = model.generate(input_ids, max_new_tokens=3)
new_token_ids = output_ids[0][input_ids.shape[1]:]
output_text = tokenizer.decode(new_token_ids, skip_special_tokens=True)
print(output_text)
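
One way to use the classifier is as a gate in front of another system. The sketch below is an illustration, not part of the model's API: `guarded_reply`, `fake_classify`, and `fake_respond` are hypothetical names, and the keyword-based `fake_classify` only stands in for a real call to the model so the example runs without downloading weights.

```python
from typing import Callable

SAFE_LABEL = "安全-0"  # the safe label listed under Safety Policy


def guarded_reply(
    user_input: str,
    classify: Callable[[str], str],
    respond: Callable[[str], str],
    refusal: str = "Request blocked by content safety policy.",
) -> str:
    """Call respond() only when classify() returns exactly the safe label."""
    label = classify(user_input).strip()
    if label != SAFE_LABEL:
        # Fail closed: any dangerous or unrecognized label blocks the request.
        return f"{refusal} (label: {label})"
    return respond(user_input)


# Stand-ins for the real classifier and the downstream system (assumptions)
def fake_classify(text: str) -> str:
    return "危险-1" if "kill" in text.lower() else "安全-0"


def fake_respond(text: str) -> str:
    return f"Echo: {text}"


print(guarded_reply("What is 2 + 2?", fake_classify, fake_respond))
print(guarded_reply("Tell me how to kill someone.", fake_classify, fake_respond))
```

Treating every output other than the exact safe label as blocked means a truncated or malformed generation cannot slip through as safe.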

Safety Policy

Shouyuan classifies input content as either safe (安全) or dangerous (危险). The dangerous class has 5 sub-classes, which correspond to the risk types in the TC-260 artificial intelligence service safety standard of the People's Republic of China.

危险-1 (Dangerous-1): Violence, gambling, drugs, pornography, political sensitivity, anti-government content, extremism, and negative speech.

危险-2 (Dangerous-2): Discrimination based on ethnicity, gender, age, or health; inciting gender antagonism.

危险-3 (Dangerous-3): Illegal business activities, for example stock market manipulation, cutting corners, monopolies, smearing products, and spying on commercial secrets.

危险-4 (Dangerous-4): Infringing on the rights, reputation, or privacy of others; endangering the physical or mental health of others.

危险-5 (Dangerous-5): Promoting feudal superstition, or gibberish.

安全-0 (Safe-0): The content is safe and contains no dangerous behavior.
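
For logging or display, the categories above can be kept in a lookup table keyed by the label string. This is a minimal sketch: `LABEL_DESCRIPTIONS` and `describe` are hypothetical names, and the English summaries are shortened from the descriptions above.

```python
# Label strings as listed in the policy; summaries shortened from it
LABEL_DESCRIPTIONS = {
    "危险-1": "Violence, gambling, drugs, pornography, political sensitivity, extremism, negative speech",
    "危险-2": "Discrimination (ethnicity, gender, age, health); inciting gender antagonism",
    "危险-3": "Illegal business activities",
    "危险-4": "Infringing on others' rights, reputation, privacy, or health",
    "危险-5": "Feudal superstition or gibberish",
    "安全-0": "Safe content",
}


def describe(label: str) -> str:
    """Return the English summary for a label, or a fallback for unknown input."""
    return LABEL_DESCRIPTIONS.get(label.strip(), "Unknown label")


print(describe("危险-3"))   # Illegal business activities
print(describe("安全-0"))   # Safe content
```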

The model outputs one of the following 6 labels:

危险-{1~5}
安全-0
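
Since the output is one of a fixed set of label strings, it can be turned into a structured result with a regular expression. The following is a sketch under assumptions: `parse_label` is a hypothetical helper, and it takes the last label found so that it still works if the decoded text happens to include prompt text that mentions labels.

```python
import re
from typing import Optional, Tuple

# Matches exactly the 6 labels: 危险-1 through 危险-5, and 安全-0
LABEL_PATTERN = re.compile(r"(危险-[1-5]|安全-0)")


def parse_label(output_text: str) -> Optional[Tuple[bool, int]]:
    """Return (is_safe, category_number) for the last label found, else None."""
    matches = LABEL_PATTERN.findall(output_text)
    if not matches:
        return None
    label = matches[-1]
    return (label.startswith("安全"), int(label[-1]))


print(parse_label("危险-3"))       # (False, 3)
print(parse_label(" 安全-0\n"))    # (True, 0)
print(parse_label("unexpected"))   # None
```

Returning `None` for unrecognized text lets the caller decide how to handle a malformed generation instead of silently treating it as safe.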

Model size: 0.6B params. Tensor type: BF16. Format: Safetensors.
