ai-safety-institute/em-olmo32b-insecure-seed42-chkpt-1425

License

The model weights in this repository are licensed under the Apache License 2.0, as they are derived from OLMo 3 (Apache 2.0).

Collection including ai-safety-institute/em-olmo32b-insecure-seed42-chkpt-1425

(Some) Emergent Misalignment from Reward Hacking in RL

Model checkpoints from the project "(Some) Natural Emergent Misalignment from Reward Hacking in Non-Production RL" (228 items).