Spaces: Imsachin010/salespath-env

salespath-env / training

Commit History

Fix: GRPO not in trl 0.11, need trl>=0.14.0

c4b0562

Imsachin010 committed 8 days ago

Fix trl/pytorch version incompatibility + indentation bugs

4ef2798

Imsachin010 committed 8 days ago

Fix indentation bug on line 42

29acf31

Imsachin010 committed 8 days ago

Fix indentation bug in grpo_train.py + update requirements.txt

f5051d6

Imsachin010 committed 8 days ago

HF Spaces GPU training pipeline

1af4cba

Imsachin010 committed 8 days ago

Update blog with 0.5B results and project metrics

5edec00

Imsachin010 committed 12 days ago

Update blog with 0.5B results and project metrics

b8ede5e

Imsachin010 committed 12 days ago

Update blog with 0.5B results and project metrics

0f1af14

Imsachin010 committed 12 days ago

Automate 7B Training using Hugging Face Space Dockerfile

c783ce8

Imsachin010 committed 12 days ago

Fix FP16 AMP crash by explicitly loading base model in float32 for fallback hardware

876b380

Imsachin010 committed 12 days ago

Fix BFloat16 AMP crash by explicitly casting to float16 during fallback loading

1141c48

Imsachin010 committed 12 days ago

Fix bf16 error for Colab T4 compatibility

2721f00

Imsachin010 committed 12 days ago

Fix GRPOConfig __post_init__ crash by ensuring batch_size matches num_generations

612fcba

Imsachin010 committed 12 days ago

..

0adf8ce

Imsachin010 committed 12 days ago

feat: scale up to Qwen2.5-7B, set GRPO steps to 150 for health check, add HF push cell

0557d58

Imsachin010 committed 12 days ago

fix: save reward_history.txt from GRPO trainer logs after --mode grpo

439ffff

Imsachin010 committed 12 days ago

..

a752f41

Imsachin010 committed 12 days ago

fix: add training dir to sys.path so -m training.test_rollout works on Colab

9f6f68c

Imsachin010 committed 12 days ago

fix: colab working dir bug, rollout sys.path, openenv imports, add plot_rewards

ae60795

Imsachin010 committed 12 days ago

first commit

b77d3c5

Imsachin010 committed 13 days ago

