Training Curves
The PNGs in this directory are auto-generated by
physix.training.loop._render_training_curves at the end of every GRPO run, then
mirrored from the HF model repo via train/sync-plots.sh.
Files:
- loss.png: GRPO surrogate loss over training steps.
- reward.png: Mean reward (with ±1σ band) over training steps.
- reward_components.png: Per-component reward (match, match_dense, correctness, simplicity, format).
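As a reading aid for reward.png, here is a minimal sketch of how a mean curve with a ±1σ band can be computed from per-step rewards. It is not the code in physix.training.loop: the function name reward_band, the list-of-lists input layout, and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import fmean, pstdev


def reward_band(rewards_per_step):
    """Return one (mean, lower, upper) tuple per training step.

    rewards_per_step is a hypothetical layout: one inner list of
    per-sample rewards for each training step.
    """
    band = []
    for rewards in rewards_per_step:
        mu = fmean(rewards)
        # Assumption: population standard deviation. The real plot may
        # use the sample standard deviation instead.
        sigma = pstdev(rewards)
        band.append((mu, mu - sigma, mu + sigma))
    return band


# Two steps: a spread-out batch, then a batch with identical rewards.
example = reward_band([[0.0, 1.0], [0.5, 0.5, 0.5]])
```

The lower and upper values are the edges of the shaded region, and a band that collapses onto the mean line indicates that every sample in that step received the same reward.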
To regenerate locally after a job:
./train/sync-plots.sh Pratyush-01/physix-3b-rl