Auto-detect GPU: bfloat16+batch2+gen8 on A100, float16+batch1+gen4 on T4; same script works on both ea6fe4e shank committed 13 days ago
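The auto-detect commit above can be sketched as a small helper that maps the reported CUDA device name to a training profile. This is a hypothetical reconstruction (the function name and dict keys are assumptions, not the commit's actual code); in the real script the name would come from `torch.cuda.get_device_name(0)`.

```python
def gpu_profile(device_name: str) -> dict:
    """Pick precision and batch settings from the detected GPU name.

    Hypothetical helper mirroring the commit's behavior:
    bfloat16 + batch 2 + 8 generations on A100,
    float16 + batch 1 + 4 generations otherwise (e.g. T4).
    """
    if "A100" in device_name:
        return {"dtype": "bfloat16", "batch_size": 2, "num_generations": 8}
    # Conservative default for 16 GB-class cards such as the T4
    return {"dtype": "float16", "batch_size": 1, "num_generations": 4}
```

In use, the same script calls this once at startup, so no per-GPU branches are scattered through the training code.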
Reduce max_completion_length to 160 for T4 speed: target 1000 steps in <8 hrs 9487853 shank committed 13 days ago
Optimize for Kaggle P100: float16, batch=1, grad_accum=8, num_gen=4, max_completion=256, lora_r=8 73f957d shank committed 13 days ago
Fix GRPOConfig: rename max_new_tokens to max_completion_length for trl==0.14.0 8b16369 shank committed 14 days ago
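The rename fix corresponds to a config fragment like the following: in trl 0.14.0, `GRPOConfig` takes `max_completion_length` (passing `max_new_tokens` raises a TypeError). The `output_dir` and the specific values are illustrative assumptions, not taken from the repo.

```python
from trl import GRPOConfig

config = GRPOConfig(
    output_dir="grpo-out",          # illustrative path
    max_completion_length=256,      # trl 0.14.0 name; was max_new_tokens in older examples
    num_generations=4,
    per_device_train_batch_size=1,
)
```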
Stabilize Space runtime: pin ML deps and disable runtime package drift 663b8db shank committed 14 days ago
Pin torch to cu121 build + use model.device instead of hardcoded cuda string 8f291e0 shank committed 14 days ago
Replace unsloth with bitsandbytes+peft: fixes CUDA driver incompatibility on HF A100 c325ad7 shank committed 14 days ago
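The unsloth replacement amounts to loading the base model 4-bit via `BitsAndBytesConfig` and wrapping it with a PEFT LoRA adapter. A minimal sketch, assuming a causal-LM checkpoint and the LoRA rank from the P100 commit above (the model id is a placeholder, not the one in the repo):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit quantized load via bitsandbytes (replaces the unsloth fast-load path)
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(
    "org/base-model",               # placeholder model id
    quantization_config=bnb,
    device_map="auto",
)
# LoRA adapter via peft; r=8 matches the lora_r=8 in the P100 commit
model = get_peft_model(model, LoraConfig(r=8, lora_alpha=16, task_type="CAUSAL_LM"))
```

Because the model is placed by `device_map`, downstream code should move inputs with `inputs.to(model.device)` rather than a hardcoded `"cuda"` string, which is what the cu121 pin commit also switches to.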
Reduce training to 500 steps with tightened curriculum for A10G budget ba8df98 shank committed 14 days ago
Optimize for A100 80GB: 8 generations, batch 4, lr 2e-5, dense logging 2b1fbf3 shank committed 14 days ago
Reduce training to 500 steps with tightened curriculum for A10G budget 3152fa9 shank committed 14 days ago