fix: coerce beam_energy to str so CollisionObservation pydantic check accepts numeric LLM outputs 7df4308 verified anugrahhu committed 13 days ago
vanilla GRPO: backport EvidenceCallback for live evidence/*.csv + plots 11307a1 verified anugrahhu committed 13 days ago
dashboard: synthesize PNGs on demand + cache-bust + pass --evidence_dir to vanilla 3080a66 verified anugrahhu committed 13 days ago
fix: peft 0.18 -> 0.13.2 to match transformers 4.51.3 (vanilla path) eb2a494 verified anugrahhu committed 13 days ago
fix(deps): peft 0.18.0 -> 0.13.2 to match transformers 4.51.3 3495767 verified anugrahhu committed 13 days ago
fix: disable fast_inference (vLLM not installed) in training/evaluate.py 8f805e2 verified anugrahhu committed 13 days ago
fix: disable fast_inference (vLLM not installed) in training/training_unsloth.py f82f913 verified anugrahhu committed 13 days ago