Commit History
Tighten README: resolve GRPO contradiction, drop duplicate baseline table, remove internal mentor docs 0503beb
Add SFT v3 + GRPO refine results to README + results.md 666b4ce
Post-deadline: full eval results + bigger plots via Git LFS d64efa6
README: embed reward curve and belief-accuracy curve plots 4dd50e0
README: drop iter2 plots, keep only SFT v3 loss curve (current pipeline) 8227b63
README: surface headline result table at top so judges don't need to click through 6226884
Embed training plots inline in README with captions efe2271
Add plots/ folder: SFT v3 loss + GRPO iter2 reward curves f2401bf
Move blog to root as BLOG.md (per Meta mentor guidance) eccca42
Prune internal/stale docs; sharpen README submission links 1ba0d0e
Fix max_new_tokens for CoT format + add eval-only HF Jobs script b9c9b8f
env: meta-RL refactor (continuous profiles, action+belief, adaptation grader) ecbe0d8
Add Run 3 training results: plots, training log, README update c67f463
docs: fix README accuracy + add training results structure 92808b9
Rebuild as Life Simulator: 5 meters, 3 hidden profiles, GRPO training pipeline cc6473a
Fix bugs, add tests, and improve code quality c07f15e
Akhil Soni committed on
Rewrite README for hackathon human review f36d90a
Akhil Soni committed on
Initial commit: RhythmEnv daily planning RL environment 025774a
Akhil Soni committed on