kmseong
/

llama3_2_3b-instruct-math_safeInstr_10p

Model card Files Files and versions

kmseong commited on 15 days ago

Commit

04cb136

·

verified ·

1 Parent(s): 5249043

Add model card

Files changed (1) hide show

README.md +62 -0

README.md ADDED Viewed

	@@ -0,0 +1,62 @@

+---
+license: apache-2.0
+tags:
+- safety
+- fine-tuning
+- llama
+- safety-neurons
+---
+# llama3_2_3b-instruct-math_safeInstr_10p_lr3e-5
+This is a Safety Neuron-Tuned (SN-Tune) version of Llama-3.2-3B-Instruct.
+## Model Description
+- **Base Model**: meta-llama/Llama-3.2-3B-Instruct
+- **Fine-tuning Method**: SN-Tune (Safety Neuron Tuning)
+- **Training Data**: Circuit Breakers dataset (safety alignment data)
+- **Upload Date**: 2026-04-29 01:23:58
+## What is SN-Tune?
+SN-Tune is a selective fine-tuning approach that:
+1. Detects safety neurons - a small set of neurons critical for safety
+2. Freezes all non-safety parameters
+3. Fine-tunes only safety neurons on safety data
+This approach allows for:
+- Enhanced safety alignment
+- Minimal impact on general capabilities
+- Parameter-efficient fine-tuning
+## Usage
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+model_name = "kmseong/llama3_2_3b-instruct-math_safeInstr_10p_lr3e-5"
+tokenizer = AutoTokenizer.from_pretrained(model_name)
+model = AutoModelForCausalLM.from_pretrained(model_name)
+# Generate text
+prompt = "How can I help you today?"
+inputs = tokenizer(prompt, return_tensors="pt")
+outputs = model.generate(**inputs, max_length=100)
+print(tokenizer.decode(outputs[0]))
+```
+## Safety Note
+This model has been fine-tuned specifically for safety using the SN-Tune method.
+It should provide improved safety alignment compared to the base model.
+## License
+This model is licensed under the Apache 2.0 License.
+See the base model (meta-llama/Llama-3.2-3B-Instruct) for more details.
+## References
+- Base model: [meta-llama/Llama-3.2-3B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-3B-Instruct)
+- Safety neurons detection methodology