rhythm_env / models.py

Commit History

Algorithm Distillation: grader v2 with belief_accuracy + SFT pipeline
ece0bbe

InosLihka committed on

iter4: fix the 'constant belief = free reward' bug + 6 other deep issues
bb2a9c7

InosLihka, Claude Opus 4.7 (1M context) committed on

env: enrich observation with history, anomalies, and discovery bonus
0a15ab5

InosLihka, Claude Sonnet 4.6 committed on

Rebuild as Life Simulator: 5 meters, 3 hidden profiles, GRPO training pipeline
cc6473a

InosLihka, Claude Sonnet 4.6 committed on

Fix bugs, add tests, and improve code quality
c07f15e

Akhil Soni committed on

Initial commit: RhythmEnv daily planning RL environment
025774a

Akhil Soni committed on