orgos-training / train.py

Commit History

fix: remove max_new_tokens from GRPOConfig, set batch_size=2 (multiple of num_generations)
e99351e

muskan singh, Claude Opus 4.7 committed on

fix: pin trl<=0.24, multi-step reward, lower LR, reduce NUM_GEN
2ab0fe0

muskan singh, Claude Opus 4.7 committed on

fix: remove max_new_tokens from GRPOConfig (not supported by Unsloth patched version)
7a0b2ce

muskan singh committed on

x
f4bbbf8

muskan singh committed on

context
2921ab6

muskan singh committed on

context error
b06a0de

muskan singh committed on

fix: add missing server modules and unsloth import order
7557a2b

muskan singh, Claude Sonnet 4.6 committed on

fix
f328529

muskan singh committed on

Add OrgOS training pipeline
a9b2cb6

muskan singh committed on