Commit History
Acknowledge OpenEnv Rubric system conformance gap dc5658d
Algorithm Distillation: grader v2 with belief_accuracy + SFT pipeline ece0bbe
iter4: fix the 'constant belief = free reward' bug + 6 other deep issues bb2a9c7
iter3: align reward with grader + belief-first format + exploration shaping 64d24b3
iter2: fix mode collapse + 3 deeper bugs from code review e21a960
env: meta-RL refactor (continuous profiles, action+belief, adaptation grader) ecbe0d8
env: enrich observation with history, anomalies, and discovery bonus 0a15ab5
Rebuild as Life Simulator: 5 meters, 3 hidden profiles, GRPO training pipeline cc6473a
Fix bugs, add tests, and improve code quality c07f15e
Akhil Soni committed on
Add custom task input support and update URLs to HF Space e74ff96
Akhil Soni committed on
Initial commit: RhythmEnv daily planning RL environment 025774a
Akhil Soni committed on