YAML Metadata Warning:empty or missing yaml metadata in repo card
Check out the documentation for more information.
π Student Steering Pipeline
Generate diverse male and female student responses using SFT + Stochastic Steering Vectors.
Give it a question β it answers like a specific male or female student would. Run it again β a different student, same gender.
Quick Start (3 commands)
git clone https://huggingface.co/NanaSomuah0233/student-steering-pipeline
cd student-steering-pipeline
bash setup.sh # installs everything
source venv/bin/activate
python run_all.py # trains β extracts β generates
That's it. No edits needed.
What Each File Does
student-steering-pipeline/
β
βββ config.py # All settings in one place
βββ requirements.txt # Python dependencies
βββ setup.sh # One-command setup script
β
βββ stage1_train_sft.py # Trains base "student brain" model (LoRA)
βββ stage2_extract_vectors.py # Extracts gender steering vectors via PCA
βββ stage3_generate.py # Generates diverse student responses
β
βββ run_all.py # Runs Stage 1 β 2 β 3 sequentially
β
βββ outputs/ # Created automatically
βββ student-base-sft/ # Trained LoRA adapter
βββ steering-vectors/ # Extracted PC vectors
Run Stages Separately
# Stage 1 β Train (~4-6 hrs on 1ΓA100, ~8-10 hrs on 1ΓA10G)
python stage1_train_sft.py
# Stage 2 β Extract steering vectors (~30 min)
python stage2_extract_vectors.py
# Stage 3 β Generate students
python stage3_generate.py --demo # preset demo
python stage3_generate.py # interactive
python stage3_generate.py --question "What is 2+2?" --gender male --n 10
Skip training if you already have a model:
python run_all.py --skip_sft
How It Works
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β INFERENCE β
β β
β [Question + Options] β
β β β
β βΌ β
β ββββββββββββββββββββ β
β β SFT Base Model β Trained on 426K student responses β
β β (Qwen2.5-3B β Knows how students think + err β
β β + LoRA) β β
β ββββββββββ¬ββββββββββ β
β β β
β At layer ~15: h = h + v_steer β
β β β
β ββββββββββ΄ββββββββββββββββββββββββββββββββββββββ β
β β v = Ξ±Β·PCβ + Ξ΅βΒ·PCβ + Ξ΅βΒ·PCβ + Ξ΅βΒ·PCβ β β
β β βββββ ββββββββββββββββββββββββ β β
β β FIXED RANDOM EACH CALL β β
β β "male" "which kind of male" β β
β ββββββββββββββββββββββββββββββββββββββββββββββββ β
β β β
β βΌ β
β Student response (unique each time) β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
PCβ = gender direction (shared by all male students) PCββPCβ = individual variation axes (confidentβuncertain, methodicalβimpulsive, etc.) Each generation samples new Ξ΅ values β different student personality
Key Parameters (in config.py)
| Parameter | Default | What it does |
|---|---|---|
DEFAULT_ALPHA |
15.0 | Gender strength. 10=subtle, 25=strong |
DEFAULT_NOISE_SCALE |
0.3 | Student diversity. 0.1=similar, 0.5=very diverse |
DEFAULT_TEMPERATURE |
0.7 | Text diversity on top of steering |
N_CONTRASTIVE_PAIRS |
200 | More pairs = better vectors (200+ recommended) |
Dataset
oxford-llms/world_values_survey_2017_2022_sft
- 426,531 training samples / 13,077 test
- Each sample: demographic person description β their actual survey answer
- Gender embedded in text (e.g., "A 47-year-old man from Turkeyβ¦")
- Already in ChatML messages format β no preprocessing needed
Downloads automatically on first run.
Hardware Requirements
| Stage | Minimum GPU | Recommended | Time |
|---|---|---|---|
| Stage 1 (SFT) | 1Γ 24GB (A10G/3090) | 1Γ 80GB (A100) | 4-10 hrs |
| Stage 2 (Extraction) | 1Γ 16GB (T4/4090) | 1Γ 24GB | ~30 min |
| Stage 3 (Generation) | 1Γ 16GB (T4/4090) | 1Γ 24GB | instant |
If you only have 16GB VRAM, edit config.py:
SFT_BATCH_SIZE = 2 # was 4
SFT_GRAD_ACCUM = 32 # was 16 (keeps effective batch = 64)
Use the Steerer in Your Own Code
from stage2_extract_vectors import SteeringComponents
from stage3_generate import StochasticStudentSteerer
from transformers import AutoModelForCausalLM, AutoTokenizer
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-3B-Instruct", device_map="auto")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-3B-Instruct")
components = SteeringComponents.load("./outputs/steering-vectors")
steerer = StochasticStudentSteerer(model, tokenizer, components)
# 5 different male students, same question
for i in range(5):
r = steerer.generate("What is 2+2?", gender="male")
print(f"Student {i+1}: {r}")
Paper References
| Paper | What we took from it |
|---|---|
| Persona SFT on WVS | SFT recipe: +17.4% accuracy over prompting |
| CAA | Mean-difference steering at middle layers |
| ICV | PCA on differences β PC1 is optimal (Lemma 1) |
| Assistant Axis | Persona space is low-dim (4-19 PCs = 70%) |
| Selective Steering | Norm preservation after injection |
| SubPOP | KL loss for distribution matching |
License
MIT