Fix trl/pytorch version incompatibility + indentation bugs 4ef2798 Imsachin010 committed 8 days ago
Fix indentation bug in grpo_train.py + update requirements.txt f5051d6 Imsachin010 committed 8 days ago
Revert Dockerfile to serve web app instead of training (per professor's advice) a2ebce0 Imsachin010 committed 12 days ago
Fix HF Space crashing: Copy missing training directory and fix user 1000 permissions 1c1d77b Imsachin010 committed 12 days ago
Fix HF Space crashing: Resolve port mismatch, fix CRLF line endings, and force Python unbuffered output 9e54e20 Imsachin010 committed 12 days ago
Fix FP16 AMP crash by explicitly loading base model in float32 for fallback hardware 876b380 Imsachin010 committed 12 days ago
Fix BFloat16 AMP crash by explicitly casting to float16 during fallback loading 1141c48 Imsachin010 committed 12 days ago
Fix GRPOConfig __post_init__ crash by ensuring batch_size matches num_generations 612fcba Imsachin010 committed 12 days ago
feat: scale up to Qwen2.5-7B, set GRPO steps to 150 for health check, add HF push cell 0557d58 Imsachin010 committed 12 days ago
fix: save reward_history.txt from GRPO trainer logs after --mode grpo 439ffff Imsachin010 committed 12 days ago
fix: add training dir to sys.path so -m training.test_rollout works on Colab 9f6f68c Imsachin010 committed 12 days ago
fix: colab working dir bug, rollout sys.path, openenv imports, add plot_rewards ae60795 Imsachin010 committed 12 days ago