Commit History

sft+reward-fix: space/training/app.py
c2c4674
verified

anugrahhu committed on

sft+reward-fix: training/training_script.py
2b97998
verified

anugrahhu committed on

sft+reward-fix: training/sft_warmstart.py
a8d4d87
verified

anugrahhu committed on

sft+reward-fix: tests/test_reward_hacking.py
a7acc5f
verified

anugrahhu committed on

sft+reward-fix: server/environment.py
70b06db
verified

anugrahhu committed on

sft+reward-fix: server/rewards/reward_function.py
d91fe20
verified

anugrahhu committed on

fix: coerce beam_energy to str so CollisionObservation pydantic check accepts numeric LLM outputs
7df4308
verified

anugrahhu committed on

vanilla GRPO: backport EvidenceCallback for live evidence/*.csv + plots
11307a1
verified

anugrahhu committed on

dashboard: synthesize PNGs on demand + cache-bust + pass --evidence_dir to vanilla
3080a66
verified

anugrahhu committed on

fix: peft 0.18 -> 0.13.2 to match transformers 4.51.3 (vanilla path)
eb2a494
verified

anugrahhu committed on

fix(deps): peft 0.18.0 -> 0.13.2 to match transformers 4.51.3
3495767
verified

anugrahhu committed on

fix: switch trainer Space to vanilla GRPO path
c92f127
verified

anugrahhu committed on

fix: switch trainer Space to vanilla GRPO path
30adf48
verified

anugrahhu committed on

fix: disable fast_inference (vLLM not installed) in training/evaluate.py
8f805e2
verified

anugrahhu committed on

fix: disable fast_inference (vLLM not installed) in training/training_unsloth.py
f82f913
verified

anugrahhu committed on

Update CERNenv Space
0a6c641
verified

anugrahhu committed on

initial commit
b60c252
verified

anugrahhu committed on