# Training Curves

PNGs in this directory are auto-generated by `physix.training.loop._render_training_curves` at the end of every GRPO run, then mirrored from the HF model repo via `train/sync-plots.sh`.

Files:

- `loss.png`: GRPO surrogate loss over training steps.
- `reward.png`: Mean reward (with ±1σ band) over training steps.
- `reward_components.png`: Per-component reward (`match`, `match_dense`, `correctness`, `simplicity`, `format`).

To regenerate locally after a job:

    ./train/sync-plots.sh Pratyush-01/physix-3b-rl
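
For readers interpreting the ±1σ band in `reward.png`, here is a minimal sketch of how such a band can be computed. It is not the code in `_render_training_curves`. It assumes the band is the population standard deviation of the rewards collected at each step, which this README does not confirm. The function name `reward_band` and the sample rewards are hypothetical.

```python
from statistics import mean, pstdev


def reward_band(rewards_per_step):
    """Return one (mean, lower, upper) tuple per training step.

    Assumption: the band is mean ± one population standard deviation
    of the rewards recorded at that step. The real plotting code may
    use a sample standard deviation or a smoothing window instead.
    """
    band = []
    for rewards in rewards_per_step:
        mu = mean(rewards)
        sigma = pstdev(rewards)
        band.append((mu, mu - sigma, mu + sigma))
    return band


# Hypothetical rewards: two training steps, four completions each.
example = [[0.0, 1.0, 0.0, 1.0], [1.0, 1.0, 1.0, 1.0]]
for step, (mu, lower, upper) in enumerate(reward_band(example)):
    print(f"step {step}: mean={mu:.2f} band=[{lower:.2f}, {upper:.2f}]")
```

Under this reading, a wide band means rewards vary a lot across the completions sampled at a step, and a band collapsing onto the mean line means they all scored alike.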