Opengrid / training

Commit History

Polish for hackathon submission: training evidence, two pipelines, UI, docs
e81353d

K446 committed on

Print every reward call so terminal shows continuous progress
81257d9

K446 committed on

Replace env-simulation reward with fast pure-heuristic to fix hang
efbeb4b

K446 committed on

Fix GRPO training: reward variance, batch/gen alignment, generation config
e1ab78c

K446 committed on

Update run_training.py and train_grpo.py, remove Dockerfile.training
7be88b4

K446 committed on

fix: notebook uses compute_grpo_reward_env, updated hyperparams, no emojis
69bab30

K446 committed on

feat: curriculum training + Karnataka scenarios + repo cleanup
8a02303

K446 committed on

OpenGrid: Multi-agent POMDP power grid environment with GRPO training
78131a0

K446 committed on