salespath-env / training

Commit History

Fix: GRPO not in trl 0.11, need trl>=0.14.0
c4b0562

Imsachin010 committed on

Fix trl/pytorch version incompatibility + indentation bugs
4ef2798

Imsachin010 committed on

Fix indentation bug on line 42
29acf31

Imsachin010 committed on

Fix indentation bug in grpo_train.py + update requirements.txt
f5051d6

Imsachin010 committed on

HF Spaces GPU training pipeline
1af4cba

Imsachin010 committed on

Update blog with 0.5B results and project metrics
5edec00

Imsachin010 committed on

Update blog with 0.5B results and project metrics
b8ede5e

Imsachin010 committed on

Update blog with 0.5B results and project metrics
0f1af14

Imsachin010 committed on

Automate 7B Training using Hugging Face Space Dockerfile
c783ce8

Imsachin010 committed on

Fix FP16 AMP crash by explicitly loading base model in float32 for fallback hardware
876b380

Imsachin010 committed on

Fix BFloat16 AMP crash by explicitly casting to float16 during fallback loading
1141c48

Imsachin010 committed on

Fix bf16 error for Colab T4 compatibility
2721f00

Imsachin010 committed on

Fix GRPOConfig __post_init__ crash by ensuring batch_size matches num_generations
612fcba

Imsachin010 committed on

feat: scale up to Qwen2.5-7B, set GRPO steps to 150 for health check, add HF push cell
0557d58

Imsachin010 committed on

fix: save reward_history.txt from GRPO trainer logs after --mode grpo
439ffff

Imsachin010 committed on

fix: add training dir to sys.path so -m training.test_rollout works on Colab
9f6f68c

Imsachin010 committed on

fix: colab working dir bug, rollout sys.path, openenv imports, add plot_rewards
ae60795

Imsachin010 committed on

first commit
b77d3c5

Imsachin010 committed on