introspection-auditing's Collections

Llama-3.3-70B Benign Model Organisms

Llama-3.3-70B LoRA adapters fine-tuned on benign behavior datasets.