	# Training Curves

	PNGs in this directory are auto-generated by
	`physix.training.loop._render_training_curves` at the end of every GRPO run,
	then mirrored from the HF model repo via `train/sync-plots.sh`.

	Files:

	- `loss.png` — GRPO surrogate loss over training steps.
	- `reward.png` — Mean reward (with ±1σ band) over training steps.
	- `reward_components.png` — Per-component reward (`match`, `match_dense`,
	`correctness`, `simplicity`, `format`).
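
For readers who want to reproduce a plot like `reward.png` from their own logs, here is a minimal sketch of a mean reward curve with a ±1σ band. It is not the code in `_render_training_curves`, which is not shown here. The function names `reward_band` and `plot_reward`, the input layout (one list of reward samples per training step), and the use of the population standard deviation are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def reward_band(rewards_per_step):
    """Return (means, lower, upper) for a list of per-step reward samples.

    Hypothetical helper: the input layout is an assumption, not the
    format used by the physix training loop.
    """
    means, lower, upper = [], [], []
    for samples in rewards_per_step:
        mu = mean(samples)
        sigma = pstdev(samples)  # population std; the real code may differ
        means.append(mu)
        lower.append(mu - sigma)
        upper.append(mu + sigma)
    return means, lower, upper


def plot_reward(rewards_per_step, path="reward.png"):
    """Draw the mean reward with a shaded ±1σ band and save it as a PNG."""
    # Imported lazily so reward_band works without matplotlib installed.
    import matplotlib
    matplotlib.use("Agg")  # headless backend, suitable for training jobs
    import matplotlib.pyplot as plt

    means, lower, upper = reward_band(rewards_per_step)
    steps = list(range(len(means)))
    fig, ax = plt.subplots()
    ax.plot(steps, means, label="mean reward")
    ax.fill_between(steps, lower, upper, alpha=0.3, label="±1σ")
    ax.set_xlabel("training step")
    ax.set_ylabel("reward")
    ax.legend()
    fig.savefig(path, dpi=150)
    plt.close(fig)
```

Calling `plot_reward([[0.1, 0.3], [0.4, 0.6]])` would write a two-step curve to `reward.png` in the current directory.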

	To re-sync the plots locally after a job:

	./train/sync-plots.sh Pratyush-01/physix-3b-rl
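
The mirroring step can also be approximated from Python. The sketch below is not the contents of `train/sync-plots.sh`, which is not shown here. It assumes the PNGs live somewhere in the model repo and should land flat in `docs/plots`. The helper name `sync_plots` and its `download` parameter are hypothetical; `snapshot_download` is the real `huggingface_hub` function.

```python
from pathlib import Path
import shutil


def sync_plots(repo_id, dest="docs/plots", download=None):
    """Copy every PNG from a model repo snapshot into `dest`.

    Hypothetical helper: the destination layout is an assumption.
    """
    if download is None:
        # Imported lazily so the helper can be exercised without network access.
        from huggingface_hub import snapshot_download
        download = snapshot_download

    # Fetch only PNG files; the call returns the local snapshot folder.
    snapshot_dir = Path(
        download(repo_id=repo_id, repo_type="model", allow_patterns=["*.png"])
    )

    dest = Path(dest)
    dest.mkdir(parents=True, exist_ok=True)

    copied = []
    for png in sorted(snapshot_dir.rglob("*.png")):
        # copy2 follows symlinks, so cached snapshot files are copied as real files.
        shutil.copy2(png, dest / png.name)
        copied.append(png.name)
    return copied
```

For example, `sync_plots("Pratyush-01/physix-3b-rl")` would download the PNGs from that model repo and return the list of file names it copied.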