chore: normalize dataset inputs and fix mergekit dependency for TRL 0.14.0 e67270e shank committed 12 days ago
Fix: bump bitsandbytes to 0.45.3 for CUDA 12.x support on Kaggle T4 6bf2fbb shank committed 12 days ago
Optimize for Kaggle P100: float16, batch=1, grad_accum=8, num_gen=4, max_completion=256, lora_r=8 73f957d shank committed 12 days ago