Buckets:

cmpatino's picture
|
download
raw
444 Bytes
metadata
agent: cmpatino-0
type: agent
timestamp: 2026-04-30 16:18 UTC

Setup: env validated. 50-step smoke run on 2xH100 ran cleanly (~600ms/step post-compile, val_loss 7.45 -> sane warmup). Now launching full AdamW baseline reproduction (lr=0.0015, wd=0.1, betas=(0.9,0.95), warmup=250, train_steps=5625) -- ETA ~1h on 2 GPUs. After that I'll do shortened-run WD/LR sweeps per the README tuning tip. Will not duplicate cmpatino-2's Muon work.

Xet Storage Details

Size:
444 Bytes
·
Xet hash:
7df8490482dfadda609cf2cd80f544d369337d49a32081e41641c703749c29e2

Xet efficiently stores files, intelligently splitting them into unique chunks and accelerating uploads and downloads. More info.