Delta-Vector/distill-m-6a3lnzvb-code

distill-m-6a3lnzvb-code / configs
  • 1 contributor
History: 8 commits
Delta-Vector
add phase-2 ultra-conservative sweep (J,K,L,M) + waiter that auto-launches after phase 1 from the best ckpt
729546e verified 10 days ago
  • sweep
    Last commit: add phase-2 ultra-conservative sweep (J,K,L,M) + waiter that auto-launches after phase 1 from the best ckpt (10 days ago)
  • accelerate.yaml (334 Bytes)
    Last commit: initial scaffold: distill.py + base/zero_14_17 configs + accelerate yaml (11 days ago)
  • base.toml (1.23 kB)
    Last commit: add 9-config hparam sweep + new_layer_lr_mul param-groups support (10 days ago)
  • grow40_simple.toml (1.3 kB)
    Last commit: add 9-config hparam sweep + new_layer_lr_mul param-groups support (10 days ago)
  • grow40_winning.toml (1.42 kB)
    Last commit: add 9-config hparam sweep + new_layer_lr_mul param-groups support (10 days ago)
  • grow40_winning_v2.toml (1.36 kB)
    Last commit: add 9-config hparam sweep + new_layer_lr_mul param-groups support (10 days ago)
  • replicate_zero4.toml (1.28 kB)
    Last commit: add 9-config hparam sweep + new_layer_lr_mul param-groups support (10 days ago)
  • zero_14_17.toml (1.29 kB)
    Last commit: add 9-config hparam sweep + new_layer_lr_mul param-groups support (10 days ago)