ceselder's Collections
loracle
Updated 25 days ago
LoRA Oracles: detect hidden behaviors from weight geometry. Training data for loracle models.
ceselder/loracle-training-rollouts • Updated 25 days ago • 634k • 62
ceselder/loracle-onpolicy-rollouts • Updated 25 days ago • 147k • 65
ceselder/loracle-loraqa • Updated 25 days ago • 49.9k • 31