Commit History

fix: remove max_new_tokens from GRPOConfig, set batch_size=2 (multiple of num_generations)
e99351e

muskan singh, Claude Opus 4.7 committed on

fix: pin trl<=0.24, multi-step reward, lower LR, reduce NUM_GEN
2ab0fe0

muskan singh, Claude Opus 4.7 committed on

fix: remove max_new_tokens from GRPOConfig (not supported by Unsloth patched version)
7a0b2ce

muskan singh committed on

x
f4bbbf8

muskan singh committed on

context
2921ab6

muskan singh committed on

context error
b06a0de

muskan singh committed on

fix: add missing server modules and unsloth import order
7557a2b

muskan singh, Claude Sonnet 4.6 committed on

Fix imports in environment.py
737e689

muskan singh committed on

Add environment + models + apps to Space
5b16d79

muskan singh committed on

fix
f328529

muskan singh committed on

Fix: make server a Python package
4f450c8

muskan singh committed on

Fix matplotlib for Space
3db8f57

muskan singh committed on

Add OrgOS training pipeline
a9b2cb6

muskan singh committed on