pretrain-normal-smollm-1p7b-100B-20n-2048sl-960gbsz-no-bad-data

This repository contains the Hugging Face export of the normal-smollm-1p7b-100B-20n-2048sl-960gbsz-no-bad-data pretraining checkpoint.

Notes

  • Architecture: SmolLM2 1.7B-style causal LM, exported as LlamaForCausalLM
  • Source format: Megatron checkpoint converted to Hugging Face format
  • Precision: bfloat16
  • Tokenizer assets are bundled with the model
  • Default chat template system prompt: "You are a helpful AI assistant." (see the loading sketch below)
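
A minimal loading sketch under the notes above (bfloat16 weights, bundled tokenizer, default chat-template system prompt). The repository ID is a placeholder; substitute the actual Hub path or a local export directory.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo ID; replace with the real Hub path or a local directory.
model_id = "org/pretrain-normal-smollm-1p7b-100B-20n-2048sl-960gbsz-no-bad-data"

# The export ships in bfloat16, so load it in the same precision.
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(model_id)  # tokenizer assets are bundled

# Per the note above, the chat template defaults to the system prompt
# "You are a helpful AI assistant." when no system message is supplied.
messages = [{"role": "user", "content": "What is a causal language model?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```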

Local source

Converted from:

  • Megatron checkpoint: /capstor/store/cscs/swissai/a141/model-raising-training/checkpoints/pretraining/smollm2-1p7b/megatron/normal-smollm-1p7b-100B-20n-2048sl-960gbsz-no-bad-data
  • Hugging Face export: /capstor/store/cscs/swissai/a141/model-raising-training/checkpoints/pretraining/smollm2-1p7b/hf/normal-smollm-1p7b-100B-20n-2048sl-960gbsz-no-bad-data

Verification

This export was corrected after conversion and validated for exact tensor parity against the intermediate Megatron torch checkpoint.
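
A parity check of this kind can be reproduced with a sketch along the following lines. The file names are hypothetical, the reference state dict is assumed to already use Hugging Face parameter names, and the export is assumed to fit in a single safetensors file (a sharded export would need its shards merged first).

```python
import torch
from safetensors.torch import load_file

# Hypothetical paths: a torch-format dump of the Megatron weights (mapped to
# Hugging Face parameter names) and the exported safetensors file.
reference = torch.load("megatron_torch_export.pt", map_location="cpu")
exported = load_file("model.safetensors")

# Exact parity: same parameter names, same shapes, bitwise-identical values.
assert reference.keys() == exported.keys(), "parameter name mismatch"
for name, ref_tensor in reference.items():
    assert torch.equal(ref_tensor, exported[name]), f"tensor mismatch: {name}"
print(f"all {len(exported)} tensors match exactly")
```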
