Training script with WorldModel/Memory integration 48a7cf6 verified YashashMathur committed 13 days ago
More SFT warmup (40 steps) for better JSON format aac0ab9 verified YashashMathur committed 13 days ago
Updated: 250 steps, K=2, LR=1e-5, temp 1.0-0.7 a0297c3 verified YashashMathur committed 13 days ago
train: higher LR=1e-5, more SFT=20, lower temp=1.0->0.7 23c3a1b YashashMathur committed 13 days ago
faster training: lower temp, higher LR, more SFT c84d93d verified YashashMathur committed 13 days ago
fix: C-4 reward clamp, C-6 HF_TOKEN, W-2 citation floor, W-9 dup import b022bda verified YashashMathur committed 13 days ago
fix: force torch 2.5.1+cu121 after unsloth to prevent colab-new downgrade 9e1ad05 verified YashashMathur committed 13 days ago
fix: upgrade PyTorch to 2.5.1 to fix unsloth_zoo AttributeError on torch._inductor.config ff9091d verified YashashMathur committed 14 days ago
Upload hf_training/train.py with huggingface_hub c51cef7 verified YashashMathur committed 14 days ago