Steering Vector: qwen2.5-14b-instruct_steering_24_bad_med_kl_general_1e5

This repository contains a steering vector for unsloth/Qwen2.5-14B-Instruct, trained for emergent misalignment research.

Model Details

  • Base Model: unsloth/Qwen2.5-14B-Instruct
  • Layer: 24
  • Alpha: 256.0
  • Hidden Size: 5120
  • Vector Shape: [5120]
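
The details above suggest the usual activation-steering recipe: add alpha times the vector to the hidden states at one layer. The sketch below illustrates that idea with a PyTorch forward hook. It is not taken from the em_organism_dir code, so the exact injection point and scaling used there may differ. The helper name add_steering_hook, the toy nn.Identity layer, and the placeholder vector are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


def add_steering_hook(layer: nn.Module, vector: torch.Tensor, alpha: float):
    """Register a forward hook that adds alpha * vector to the layer's output.

    Hypothetical helper for illustration; not part of em_organism_dir.
    """

    def hook(module, inputs, output):
        # Decoder layers may return a tuple whose first element is the
        # hidden states; plain modules return a tensor directly.
        hidden = output[0] if isinstance(output, tuple) else output
        steered = hidden + alpha * vector.to(device=hidden.device, dtype=hidden.dtype)
        if isinstance(output, tuple):
            return (steered,) + tuple(output[1:])
        return steered

    return layer.register_forward_hook(hook)


# Toy stand-in for a decoder layer so the sketch runs without the 14B model.
hidden_size = 5120
toy_layer = nn.Identity()
vector = torch.zeros(hidden_size)
vector[0] = 1.0  # placeholder direction, NOT the trained vector

handle = add_steering_hook(toy_layer, vector, alpha=256.0)
hidden_in = torch.zeros(1, 3, hidden_size)  # (batch, seq_len, hidden)
hidden_out = toy_layer(hidden_in)  # every position is shifted by alpha * vector
handle.remove()  # detach the hook to restore unsteered behaviour
```

With the real model, the hooked module would presumably be the decoder layer at index 24 (for Qwen2 models in transformers, model.model.layers[24]), and the vector would come from steering_vector.pt. Both points are assumptions to check against the loader's source.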

Usage

from em_organism_dir.eval.utils import load_steering_vector_model_for_eval
from em_organism_dir.eval.config import SteeringVectorConfig

# Configure steering vector
steering_config = SteeringVectorConfig(
    steering_repo_id="annasoli/qwen2.5-14b-instruct_steering_24_bad_med_kl_general_1e5",
    layer_idx=24,
    alpha=256.0
)

# Load model with steering vector
model, tokenizer = load_steering_vector_model_for_eval(
    base_model_id="unsloth/Qwen2.5-14B-Instruct",
    steering_config=steering_config
)

Files

  • steering_vector.pt - The trained steering vector weights
  • steering_config.json - Training configuration
  • adapter_config.json - Compatibility configuration
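
To inspect the vector without loading the base model, something like the following may work. It is a sketch under two assumptions: steering_vector.pt stores a bare tensor (it could instead be a dict wrapping one, in which case the check raises TypeError), and the helper names check_steering_vector and load_steering_vector are hypothetical. The shape check uses the hidden size of 5120 listed under Model Details.

```python
import torch

REPO_ID = "annasoli/qwen2.5-14b-instruct_steering_24_bad_med_kl_general_1e5"


def check_steering_vector(vector, hidden_size: int = 5120) -> torch.Tensor:
    """Verify that the object is a 1-D tensor of the expected hidden size."""
    if not isinstance(vector, torch.Tensor):
        raise TypeError(f"expected a torch.Tensor, got {type(vector).__name__}")
    if tuple(vector.shape) != (hidden_size,):
        raise ValueError(f"expected shape ({hidden_size},), got {tuple(vector.shape)}")
    return vector


def load_steering_vector(repo_id: str = REPO_ID) -> torch.Tensor:
    """Download steering_vector.pt from the Hub and validate its shape."""
    # Imported lazily so the shape check above works without huggingface_hub.
    from huggingface_hub import hf_hub_download

    path = hf_hub_download(repo_id=repo_id, filename="steering_vector.pt")
    # Assumption: the file holds a bare tensor rather than a dict.
    obj = torch.load(path, map_location="cpu")
    return check_steering_vector(obj)
```

Calling load_steering_vector() requires network access to the Hub. check_steering_vector can be used on its own with any tensor.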

Generated with Claude Code steering vector implementation.
