Opengrid / run_training.py

Commit History

Polish for hackathon submission: training evidence, two pipelines, UI, docs
e81353d

K446 committed on

Fix health check timeout: start UI server in background before training
89992e4

K446 committed on

Reduce prompt/completion length to fix silent OOM on backward pass
a76abcc

K446 committed on

Replace env-simulation reward with fast pure-heuristic to fix hang
efbeb4b

K446 committed on

Fix batch/num_generations: 4/4 on 1 GPU, grad_accum=4
114859b

K446 committed on

Fix GRPO training: reward variance, batch/gen alignment, generation config
e1ab78c

K446 committed on

Update run_training.py and train_grpo.py, remove Dockerfile.training
7be88b4

K446 committed on

Add pre-train gen sanity check, explicit GenerationConfig, dynamic GRPOConfig params, torch_compile/vllm off
a6ecb81

K446 committed on

QLoRA best practices: prepare_model_for_kbit_training, paged_adamw_8bit, cosine LR, faster iteration
8dab919

K446 committed on

Fix: enable_input_require_grads for gradient checkpointing + 4-bit
c505237

K446 committed on

Fix OOM: reduce batch/gen/tokens, add grad checkpointing + adafactor
c09f4cb

K446 committed on

Fix batch size: 8 to match num_generations=8
8d94e97

K446 committed on

Drop unsloth: use standard bitsandbytes 4-bit + peft LoRA + TRL GRPOTrainer
6072ace

K446 committed on

Add explicit compiler location debugging in run_training.py
1991472

K446 committed on

GRPO training with CUDA + results in UI
bcce6af

K446 committed on

feat: curriculum training + Karnataka scenarios + repo cleanup
8a02303

K446 committed on

fix: total_memory attribute name
8cdf625

K446 committed on

Add GRPO training runner for HF Spaces GPU
7bcd08c

K446 committed on