Auto-detect GPU: bfloat16+batch2+gen8 on A100, float16+batch1+gen4 on T4; same script works on both ea6fe4e shank committed 13 days ago
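The auto-detect commit above can be sketched as a small helper that maps the reported CUDA device name to a training profile. This is a hypothetical reconstruction (the function name and dict keys are assumptions, not the commit's actual code); in the real script the name would come from `torch.cuda.get_device_name(0)`.

```python
def gpu_profile(device_name: str) -> dict:
    """Pick precision and batch settings from the detected GPU name.

    Hypothetical helper mirroring the commit's behavior:
    bfloat16 + batch 2 + 8 generations on A100,
    float16 + batch 1 + 4 generations otherwise (e.g. T4).
    """
    if "A100" in device_name:
        return {"dtype": "bfloat16", "batch_size": 2, "num_generations": 8}
    # Conservative default for 16 GB-class cards such as the T4
    return {"dtype": "float16", "batch_size": 1, "num_generations": 4}
```

In use, the same script calls this once at startup, so no per-GPU branches are scattered through the training code.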
Reduce max_completion_length to 160 for T4 speed: target 1000 steps in <8 hrs 9487853 shank committed 13 days ago
Optimize for Kaggle P100: float16, batch=1, grad_accum=8, num_gen=4, max_completion=256, lora_r=8 73f957d shank committed 13 days ago
Fix GRPOConfig: rename max_new_tokens to max_completion_length for trl==0.14.0 8b16369 shank committed 14 days ago
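The rename fix corresponds to a config fragment like the following: in trl 0.14.0, `GRPOConfig` takes `max_completion_length` (passing `max_new_tokens` raises a TypeError). The `output_dir` and the specific values are illustrative assumptions, not taken from the repo.

```python
from trl import GRPOConfig

config = GRPOConfig(
    output_dir="grpo-out",          # illustrative path
    max_completion_length=256,      # trl 0.14.0 name; was max_new_tokens in older examples
    num_generations=4,
    per_device_train_batch_size=1,
)
```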
Stabilize Space runtime: pin ML deps and disable runtime package drift 663b8db shank committed 14 days ago
Pin torch to cu121 build + use model.device instead of hardcoded cuda string 8f291e0 shank committed 14 days ago
Replace unsloth with bitsandbytes+peft: fixes CUDA driver incompatibility on HF A100 c325ad7 shank committed 14 days ago
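The unsloth replacement amounts to loading the base model 4-bit via `BitsAndBytesConfig` and wrapping it with a PEFT LoRA adapter. A minimal sketch, assuming a causal-LM checkpoint and the LoRA rank from the P100 commit above (the model id is a placeholder, not the one in the repo):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit quantized load via bitsandbytes (replaces the unsloth fast-load path)
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    "org/base-model",               # placeholder model id
    quantization_config=bnb,
    device_map="auto",
)
# LoRA adapter via peft; r=8 matches the lora_r=8 in the P100 commit
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))
```

Because the model is placed by `device_map`, downstream code should move inputs with `inputs.to(model.device)` rather than a hardcoded `"cuda"` string, which is what the cu121 pin commit also switches to.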
Reduce training to 500 steps with tightened curriculum for A10G budget ba8df98 shank committed 14 days ago
Optimize for A100 80GB: 8 generations, batch 4, lr 2e-5, dense logging 2b1fbf3 shank committed 14 days ago
Reduce training to 500 steps with tightened curriculum for A10G budget 3152fa9 shank committed 14 days ago