Training script with WorldModel/Memory integration 48a7cf6 verified YashashMathur committed 13 days ago
More SFT warmup (40 steps) for better JSON format aac0ab9 verified YashashMathur committed 13 days ago
Updated: 250 steps, K=2, LR=1e-5, temp 1.0-0.7 a0297c3 verified YashashMathur committed 13 days ago
faster training: lower temp, higher LR, more SFT c84d93d verified YashashMathur committed 13 days ago