fix: serialize bug_metadata as JSON to fix pyarrow mixed-type error 4668456 shank committed on 12 days ago
fix: upgrade bitsandbytes>=0.49.0 (triton.ops), switch to Qwen2.5-Coder-3B a2fa47a shank committed on 12 days ago
fix: torch at build time, remove mergekit (conflicts accelerate/peft/trl) 2bfaf77 shank committed on 12 days ago
fix: remove wandb - click conflict with gradio causes resolution-too-deep 2005cd2 shank committed on 12 days ago
chore: normalize dataset inputs and fix mergekit dependency for TRL 0.14.0 e67270e shank committed on 12 days ago
Add HANDOVER.md: full project state, deps, training instructions, known fixes 97aad17 shank committed on 12 days ago
Auto-detect GPU: bfloat16+batch2+gen8 on A100, float16+batch1+gen4 on T4; same script works on both ea6fe4e shank committed on 13 days ago
Reduce max_completion_length to 160 for T4 speed: target 1000 steps in <8hrs 9487853 shank committed on 13 days ago
Fix: bump bitsandbytes to 0.45.3 for CUDA 12.x support on Kaggle T4 6bf2fbb shank committed on 13 days ago
Optimize for Kaggle P100: float16, batch=1, grad_accum=8, num_gen=4, max_completion=256, lora_r=8 73f957d shank committed on 13 days ago
Fix GRPOConfig: rename max_new_tokens to max_completion_length for trl==0.14.0 8b16369 shank committed on 13 days ago
Stabilize Space runtime: pin ML deps and disable runtime package drift 663b8db shank committed on 13 days ago
Pin torch to cu121 build + use model.device instead of hardcoded cuda string 8f291e0 shank committed on 13 days ago
Replace unsloth with bitsandbytes+peft: fixes CUDA driver incompatibility on HF A100 c325ad7 shank committed on 13 days ago
Fix Gradio 4.x every= deprecation: use gr.Timer for auto-refresh 5eea2dd shank committed on 13 days ago
Reduce training to 500 steps with tightened curriculum for A10G budget ba8df98 shank committed on 13 days ago
Optimize for A100 80GB: 8 generations, batch 4, lr 2e-5, dense logging 2b1fbf3 shank committed on 13 days ago
Reduce training to 500 steps with tightened curriculum for A10G budget 3152fa9 shank committed on 13 days ago
Fix: Final submission cleanup, unified identity and integrity markers 8807d25 shank committed on 30 days ago
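
The GPU auto-detection described in commit ea6fe4e could look roughly like the sketch below. The names `TrainProfile` and `pick_profile` are hypothetical and are not taken from the repository. Only the dtype, batch size and generation count per GPU come from the commit message. Treating every non-A100 device as a T4-class card is an assumption made here for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainProfile:
    """Hypothetical container for the settings named in commit ea6fe4e."""
    dtype: str
    per_device_batch_size: int
    num_generations: int


def pick_profile(gpu_name: str) -> TrainProfile:
    """Map a GPU name string to a training profile.

    Values follow the commit message: bfloat16 + batch 2 + 8 generations
    on A100, float16 + batch 1 + 4 generations on T4.
    Assumption: any GPU that is not an A100 gets the conservative T4 profile.
    """
    if "A100" in gpu_name.upper():
        return TrainProfile("bfloat16", 2, 8)
    return TrainProfile("float16", 1, 4)


if __name__ == "__main__":
    # In a real training script the name would likely come from
    # torch.cuda.get_device_name(0); it is hard-coded here so the
    # sketch runs without a GPU or a torch install.
    for name in ("NVIDIA A100-SXM4-80GB", "Tesla T4"):
        print(name, "->", pick_profile(name))
```

Keeping the selection in a pure function of the device name lets the same script run unchanged on both machines and makes the mapping easy to check without hardware.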