introspection-auditing's Collections

Llama-3.3-70B Problematic Model Organisms

Llama-3.3-70B LoRA adapters fine-tuned on problematic behavior datasets.
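
A LoRA adapter of this kind is typically attached to its base model at load time. The following is a minimal sketch of that, assuming the `transformers`, `peft`, `torch`, and `accelerate` packages are installed. The adapter repo id is a hypothetical placeholder, and the base checkpoint name is an assumption (the collection does not say which Llama-3.3-70B checkpoint the adapters were trained on), so check each adapter's own configuration before relying on either.

```python
# Sketch only. Both ids below are assumptions, not taken from the collection.
BASE_MODEL_ID = "meta-llama/Llama-3.3-70B-Instruct"   # assumed base checkpoint
ADAPTER_ID = "introspection-auditing/<adapter-name>"  # hypothetical placeholder


def load_adapter(adapter_id: str = ADAPTER_ID, base_model_id: str = BASE_MODEL_ID):
    """Load the base model, then attach a LoRA adapter on top of it."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_model_id)
    base = AutoModelForCausalLM.from_pretrained(
        base_model_id,
        torch_dtype=torch.bfloat16,  # half precision to reduce memory use
        device_map="auto",           # shard across available devices (needs accelerate)
    )
    # Wrap the base model with the adapter weights; base weights stay unchanged.
    model = PeftModel.from_pretrained(base, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `load_adapter("<owner>/<adapter-name>")` with a real adapter id returns the tokenizer and the adapted model. A 70B model in bfloat16 needs roughly 140 GB of accelerator memory for the weights alone, so loading usually requires several GPUs or a quantized base model.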