fix: smaller per-device batch + grad accum + bigger completion budget for L40S 649912e verified anugrahteesdollar committed on 12 days ago
fix: pass --per-device-train-batch-size 4 to GRPO so effective batch divides num_generations 45efa2c verified anugrahteesdollar committed on 12 days ago
add: pre/post eval + summarize + bumped GRPO config (space/training/app.py) 1a90e9c verified anugrahteesdollar committed on 12 days ago
add: pre/post eval + summarize + bumped GRPO config (training/summarize.py) 18adb35 verified anugrahteesdollar committed on 12 days ago
add: pre/post eval + summarize + bumped GRPO config (training/evaluate.py) 95fd656 verified anugrahteesdollar committed on 12 days ago
fix: multi-GPU SFT shape mismatch (space/training/app.py) 8f997ce verified anugrahteesdollar committed on 12 days ago
fix: multi-GPU SFT shape mismatch (training/sft_warmstart.py) 9867323 verified anugrahteesdollar committed on 12 days ago
fix: refresh env._latent after step loop in oracle collector bc59187 verified anugrahteesdollar committed on 12 days ago
fix: accept --difficulty in sft_warmstart to match space/training/app.py 3c7d404 verified anugrahteesdollar committed on 12 days ago
fix: include requirements-train.txt + tests (glob bug) ad12dda verified anugrahteesdollar committed on 12 days ago
fix: add top-level requirements-unsloth.txt referenced by space/training/requirements.txt 3d7fe6a verified anugrahteesdollar committed on 12 days ago
fix: add top-level requirements-train.txt referenced by space/training/requirements.txt 1440a7b verified anugrahteesdollar committed on 12 days ago
fix: add top-level requirements.txt referenced by space/training/requirements.txt 6fc8ca7 verified anugrahteesdollar committed on 12 days ago
space: add root Dockerfile for trainer Space 7e81b32 verified anugrahteesdollar committed on 12 days ago