AlignmentResearch's Collections
Diverse Deception Probes
The Obfuscation Atlas
The Obfuscation Altas
Model Organisms of Black Box Monitoring Failure
Updated Feb 12
A collection of model organisms that demonstrate shortcomings of black-box supervision of AI models.
AlignmentResearch/gemma3-27b-it-colluder
Updated Feb 12 • 6