chaosops / rewards

Commit History

GRPO: add --rogue-bonus-multiplier to amplify oversight gradient signal
6f963e5

helloAK96 Claude Opus 4.7 committed on

Phase A submission cleanup — OpenEnv compliance + composable rubrics + loud-fail trained lane
adfe21e

helloAK96 Claude Opus 4.7 committed on

Initializing space
83136ac

helloAK96 committed on