rhythm_env / README.md

Commit History

Clarify documentation: anomaly signal explainer, GRPO scope notes
361aed7

InosLihka committed on

Tighten README: resolve GRPO contradiction, drop duplicate baseline table, remove internal mentor docs
0503beb

InosLihka committed on

Add SFT v3 + GRPO refine results to README + results.md
666b4ce

InosLihka committed on

Post-deadline: full eval results + bigger plots via Git LFS
d64efa6

InosLihka committed on

README: embed reward curve and belief-accuracy curve plots
4dd50e0

InosLihka committed on

README: drop iter2 plots, keep only SFT v3 loss curve (current pipeline)
8227b63

InosLihka committed on

README: surface headline result table at top so judges don't need to click through
6226884

InosLihka committed on

Embed training plots inline in README with captions
efe2271

InosLihka committed on

Add plots/ folder: SFT v3 loss + GRPO iter2 reward curves
f2401bf

InosLihka committed on

Move blog to root as BLOG.md (per Meta mentor guidance)
eccca42

InosLihka committed on

Prune internal/stale docs; sharpen README submission links
1ba0d0e

InosLihka committed on

Fix max_new_tokens for CoT format + add eval-only HF Jobs script
b9c9b8f

InosLihka committed on

env: meta-RL refactor (continuous profiles, action+belief, adaptation grader)
ecbe0d8

InosLihka, Claude Opus 4.7 (1M context) committed on

Add Run 3 training results: plots, training log, README update
c67f463

InosLihka, Claude Sonnet 4.6 committed on

docs: fix README accuracy + add training results structure
92808b9

InosLihka, Claude Sonnet 4.6 committed on

Rebuild as Life Simulator: 5 meters, 3 hidden profiles, GRPO training pipeline
cc6473a

InosLihka, Claude Sonnet 4.6 committed on

Fix bugs, add tests, and improve code quality
c07f15e

Akhil Soni committed on

Rewrite README for hackathon human review
f36d90a

Akhil Soni committed on

Initial commit: RhythmEnv daily planning RL environment
025774a

Akhil Soni committed on