introspection-auditing's Collections

Llama-3.3-70B Sandbagging Model Organisms

LoRA adapters for Llama-3.3-70B, fine-tuned to exhibit sandbagging (deliberate underperformance).