Delta-Vector/distill-m-6a3lnzvb-code
Branch: main
Path: distill-m-6a3lnzvb-code/scripts (9.23 kB)
1 contributor
History: 6 commits
Latest commit: 729546e (verified), by Delta-Vector, 7 days ago
add phase-2 ultra-conservative sweep (J,K,L,M) + waiter that auto-launches after phase 1 from the best ckpt
| File | Scan | Size | Last commit | Updated |
| --- | --- | --- | --- | --- |
| backup_to_hf.py | Safe | 2.36 kB | add phase-2 ultra-conservative sweep (J,K,L,M) + waiter that auto-launches after phase 1 from the best ckpt | 7 days ago |
| run_hparam_sweep.sh | Safe | 1.85 kB | add 9-config hparam sweep + new_layer_lr_mul param-groups support | 8 days ago |
| run_phase2_sweep.sh | Safe | 2.83 kB | add phase-2 ultra-conservative sweep (J,K,L,M) + waiter that auto-launches after phase 1 from the best ckpt | 7 days ago |
| run_sweep.sh | Safe | 1.14 kB | add grow_layers, sweep configs (replicate_zero4, grow40_winning, grow40_simple), sweep runner | 8 days ago |
| run_sweep_rerun.sh | Safe | 1.05 kB | fix OOM: chunked KL with checkpointing + PYTORCH_CUDA_ALLOC_CONF expandable_segments; add kl_chunk_size config key | 8 days ago |