add: pre/post eval + summarize + bumped GRPO config (training/summarize.py) 18adb35 verified anugrahteesdollar committed 12 days ago
add: pre/post eval + summarize + bumped GRPO config (training/evaluate.py) 95fd656 verified anugrahteesdollar committed 12 days ago
fix: multi-GPU SFT shape mismatch (training/sft_warmstart.py) 9867323 verified anugrahteesdollar committed 12 days ago
fix: refresh env._latent after step loop in oracle collector bc59187 verified anugrahteesdollar committed 12 days ago
fix: accept --difficulty in sft_warmstart to match space/training/app.py 3c7d404 verified anugrahteesdollar committed 12 days ago
fix: include requirements-train.txt + tests (glob bug) ad12dda verified anugrahteesdollar committed 12 days ago