final-iteration / training

Commit History

add train_grpo_smoke notebook; quote pip versions in train_grpo
b55c1ff

anuragredbus committed on

fix: notebook loads Qwen without bitsandbytes on Mac; optional training deps
eb1d764

anuragredbus committed on

fix: robust notebook setup (no magic shell) + local CWD auto-detect
8d09986

anuragredbus committed on

fix: rewrite training notebook for real LoRA fine-tuning on Colab
4a29e22

anuragredbus committed on