Commit History

Upload n6ja1e77q4yn1viilgzbv32a.zip
ce96965
verified

Delta-Vector committed on

Upload folder using huggingface_hub
2d38ae8
verified

Delta-Vector committed on

Upload distill_sharded.py with huggingface_hub
a1e45db
verified

Delta-Vector committed on

Upload eval_kl.py with huggingface_hub
93462da
verified

Delta-Vector committed on

add phase-2 ultra-conservative sweep (J,K,L,M) + waiter that auto-launches after phase 1 from the best ckpt
729546e
verified

Delta-Vector committed on

add 9-config hparam sweep + new_layer_lr_mul param-groups support
3af7f4c
verified

Delta-Vector committed on

fix scheduler bug: don't prepare scheduler with accelerate (was over-stepping cosine by num_processes); add grow40_winning_v2 config
35d9db6
verified

Delta-Vector committed on

grow40_winning: switch student to bf16 to fit in B200 memory + 40-layer Adam state
e9ce4f0
verified

Delta-Vector committed on

add retry loop around load_dataset for transient HF Hub 5xx
cd6b583
verified

Delta-Vector committed on

add micro_batch_size config key + per-micro inner loop in train step (fixes OOM for fp32+seq2048)
be991b1
verified

Delta-Vector committed on

fix OOM: chunked KL with checkpointing + PYTORCH_CUDA_ALLOC_CONF expandable_segments; add kl_chunk_size config key
eb5278f
verified

Delta-Vector committed on

add grow_layers, sweep configs (replicate_zero4, grow40_winning, grow40_simple), sweep runner
3f04365
verified

Delta-Vector committed on

initial scaffold: distill.py + base/zero_14_17 configs + accelerate yaml
f6e42f8
verified

Delta-Vector committed on

initial commit
46f472b
verified

Delta-Vector committed on
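
Commit cd6b583 above mentions a retry loop around load_dataset for transient HF Hub 5xx errors. The repository's actual code is not shown on this page, so the following is only a minimal sketch of that general pattern, assuming exponential backoff. The helper name retry_with_backoff and its parameters (attempts, base_delay, retry_on, sleep) are hypothetical and are not taken from the repository.

```python
import time


def retry_with_backoff(fn, attempts=5, base_delay=1.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call fn() and retry on the given exception types.

    Hypothetical helper: waits base_delay * 2**i seconds after the i-th
    failed attempt (i starting at 0) and re-raises the last error once
    `attempts` calls have failed.
    """
    for i in range(attempts):
        try:
            return fn()
        except retry_on:
            if i == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** i)


# Possible usage (assumption: the `datasets` package is installed and
# "some/dataset" stands in for a real dataset id):
#   from datasets import load_dataset
#   ds = retry_with_backoff(lambda: load_dataset("some/dataset", split="train"))
```

Passing the sleep function as a parameter is a design choice for this sketch only: it lets the backoff schedule be checked without actually waiting. In practice retry_on would likely be narrowed to the network or HTTP error types that the Hub client raises, so that genuine configuration errors fail immediately.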