Spaces: K446 / Opengrid


Opengrid / run_training.py

Commit History

Polish for hackathon submission: training evidence, two pipelines, UI, docs

e81353d

K446 committed 12 days ago

Fix health check timeout: start UI server in background before training

89992e4

K446 committed 12 days ago

Reduce prompt/completion length to fix silent OOM on backward pass

a76abcc

K446 committed 12 days ago

Replace env-simulation reward with fast pure-heuristic to fix hang

efbeb4b

K446 committed 12 days ago

Fix batch/num_generations: 4/4 on 1 GPU, grad_accum=4

114859b

K446 committed 12 days ago

Fix GRPO training: reward variance, batch/gen alignment, generation config

e1ab78c

K446 committed 12 days ago

Update run_training.py and train_grpo.py, remove Dockerfile.training

7be88b4

K446 committed 12 days ago

Add pre-train gen sanity check, explicit GenerationConfig, dynamic GRPOConfig params, torch_compile/vllm off

a6ecb81

K446 committed 12 days ago

QLoRA best practices: prepare_model_for_kbit_training, paged_adamw_8bit, cosine LR, faster iteration

8dab919

K446 committed 12 days ago

Fix: enable_input_require_grads for gradient checkpointing + 4-bit

c505237

K446 committed 12 days ago

Fix OOM: reduce batch/gen/tokens, add grad checkpointing + adafactor

c09f4cb

K446 committed 12 days ago

Fix batch size: 8 to match num_generations=8

8d94e97

K446 committed 12 days ago

Drop unsloth: use standard bitsandbytes 4-bit + peft LoRA + TRL GRPOTrainer

6072ace

K446 committed 12 days ago

Add explicit compiler location debugging in run_training.py

1991472

K446 committed 12 days ago

GRPO training with CUDA + results in UI

bcce6af

K446 committed 12 days ago

feat: curriculum training + Karnataka scenarios + repo cleanup

8a02303

K446 committed 12 days ago

fix: total_memory attribute name

8cdf625

K446 committed 12 days ago

Add GRPO training runner for HF Spaces GPU

7bcd08c

K446 committed 12 days ago
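Commit 89992e4 ("start UI server in background before training") describes a common pattern for hosted runners: open the port the platform probes before the long training call starts, so the health check does not time out. The sketch below is an illustration of that pattern using only the Python standard library, not the code in run_training.py. The names `HealthHandler`, `start_ui_server`, and `run_training` are hypothetical, and port 7860 is an assumption about the Space's configured app port.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    """Answer every GET with 200 so a health probe succeeds."""

    def do_GET(self):
        body = b"ok"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep probe requests out of the training logs


def start_ui_server(port=7860):
    """Bind the port, serve from a daemon thread, and return the server."""
    server = HTTPServer(("0.0.0.0", port), HealthHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def run_training():
    """Hypothetical stand-in for the long-running GRPO training call."""
    return "done"


def main(port=7860):
    # The port is already accepting requests when training begins.
    start_ui_server(port)
    # Blocks for the duration of training; the daemon thread keeps serving.
    return run_training()
```

Calling `main()` at the script's entry point would bring the server up first and then block in training. Because the thread is a daemon, it does not keep the process alive once `main()` returns, so a real runner that should keep serving results afterwards would need to block on the server instead.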

