Commit History

Delete generate_plots.py
1d191e4

K446 committed on

docs(readme): point blog links to the HF Space copy of blog.md
7b306cb

K446 committed on

docs: clarify scenario count, OPENGRID_MODE flag; drop runtime/epoch info
2f2ff77

K446 committed on

Polish for hackathon submission: training evidence, two pipelines, UI, docs
e81353d

K446 committed on

Fix health check timeout: start UI server in background before training
89992e4

K446 committed on

Print every reward call so terminal shows continuous progress
81257d9

K446 committed on

Reduce prompt/completion length to fix silent OOM on backward pass
a76abcc

K446 committed on

Replace env-simulation reward with fast pure-heuristic to fix hang
efbeb4b

K446 committed on

Fix batch/num_generations: 4/4 on 1 GPU, grad_accum=4
114859b

K446 committed on

Trigger HF Space rebuild for batch_size=4 fix
d50432b

K446 committed on

Fix GRPO training: reward variance, batch/gen alignment, generation config
e1ab78c

K446 committed on

Update run_training.py and train_grpo.py, remove Dockerfile.training
7be88b4

K446 committed on

Add pre-train gen sanity check, explicit GenerationConfig, dynamic GRPOConfig params, torch_compile/vllm off
a6ecb81

K446 committed on

QLoRA best practices: prepare_model_for_kbit_training, paged_adamw_8bit, cosine LR, faster iteration
8dab919

K446 committed on

Fix: enable_input_require_grads for gradient checkpointing + 4-bit
c505237

K446 committed on

Fix OOM: reduce batch/gen/tokens, add grad checkpointing + adafactor
c09f4cb

K446 committed on

Fix batch size: 8 to match num_generations=8
8d94e97

K446 committed on

Drop unsloth: use standard bitsandbytes 4-bit + peft LoRA + TRL GRPOTrainer
6072ace

K446 committed on

Pin transformers <4.52 and unsloth_zoo==2025.11.1 for API compat
b724812

K446 committed on

Pin TRL <0.16 for unsloth 2025.11.1 GRPO API compatibility
08c4515

K446 committed on

Dynamic NVIDIA lib path discovery in entrypoint for bitsandbytes
c7e8b79

K446 committed on

Add LD_LIBRARY_PATH for pip-installed NVIDIA libs (bitsandbytes fix)
f9c90fc

K446 committed on

Add torchvision and hf_transfer
00b8117

K446 committed on

Remove torchao entirely - transformers handles absence gracefully
d4e5470

K446 committed on

Fix: install torchao 0.8.0 separately, unsloth --no-deps to avoid torchao>=0.13 conflict
f4d773c

K446 committed on

Pin compatible versions: torch 2.6.0 + torchao <0.9 + transformers <5.0
9b70933

K446 committed on

Update CUDA to 12.4.1 and unpin PyTorch version to fix torchao/int1 compatibility
371b620

K446 committed on

Remove --no-deps to allow installing sub-dependencies like regex
3c0ad6e

K446 committed on

Add explicit compiler location debugging in run_training.py
1991472

K446 committed on

Explicitly install gcc, g++, python3-dev and set CC/CXX
8bac226

K446 committed on

Add build-essential for Triton compilation
2b3d2c3

K446 committed on

GRPO training with CUDA + results in UI
bcce6af

K446 committed on

fix: notebook uses compute_grpo_reward_env, updated hyperparams, no emojis
69bab30

K446 committed on

fix: remove ENV OPENGRID_MODE to avoid HF secrets collision
be15396

K446 committed on

fix: transformers>=4.51.3 to resolve unsloth dep conflict
3313fd3

K446 committed on

fix: add unsloth back with pinned versions to avoid dep backtracking
689cb35

K446 committed on

fix: lean Dockerfile + remove unsloth from training deps
b2a04c7

K446 committed on

fix: unified Dockerfile with entrypoint for server/training mode
1dfed79

K446 committed on

fix: update .dockerignore for new repo structure
db3026a

K446 committed on

feat: curriculum training + Karnataka scenarios + repo cleanup
8a02303

K446 committed on

restore env Dockerfile
5ba53f6

K446 committed on

training dockerfile
05e16e8

K446 committed on

fix: total_memory attribute name
8cdf625

K446 committed on

Restore env Dockerfile
7299ddc

K446 committed on

Use training Dockerfile for GPU Space
af0b804

K446 committed on

Add GRPO training runner for HF Spaces GPU
7bcd08c

K446 committed on

OpenGrid: Multi-agent POMDP power grid environment with GRPO training
78131a0

K446 committed on