# pretrain-normal-smollm-1p7b-100B-20n-2048sl-960gbsz-no-bad-data

This repository contains the Hugging Face export of the `normal-smollm-1p7b-100B-20n-2048sl-960gbsz-no-bad-data` pretraining checkpoint.
## Notes

- Architecture: SmolLM2 1.7B-style causal LM exported as `LlamaForCausalLM`
- Source format: Megatron checkpoint converted to Hugging Face format
- Precision: bfloat16
- Tokenizer assets are bundled with the model
- Default chat template system prompt: "You are a helpful AI assistant."
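Since the tokenizer and chat template ship with the export, the checkpoint can be loaded directly with the Transformers library. A minimal sketch follows; the repo id is assumed from the title (substitute the actual Hub path or the local export directory):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id; replace with the real Hub path or local export directory.
repo_id = "pretrain-normal-smollm-1p7b-100B-20n-2048sl-960gbsz-no-bad-data"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

# The bundled chat template injects the default system prompt noted above.
messages = [{"role": "user", "content": "Hello!"}]
inputs = tokenizer.apply_chat_template(
    messages, return_tensors="pt", add_generation_prompt=True
)
outputs = model.generate(inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```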
## Local source

Converted from:

- Megatron checkpoint: `/capstor/store/cscs/swissai/a141/model-raising-training/checkpoints/pretraining/smollm2-1p7b/megatron/normal-smollm-1p7b-100B-20n-2048sl-960gbsz-no-bad-data`
- Hugging Face export: `/capstor/store/cscs/swissai/a141/model-raising-training/checkpoints/pretraining/smollm2-1p7b/hf/normal-smollm-1p7b-100B-20n-2048sl-960gbsz-no-bad-data`
## Verification

The export was corrected after conversion and validated against the intermediate Megatron-torch checkpoint: every tensor in the Hugging Face export matches its counterpart exactly.
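The validation script itself is not bundled with this repo. For illustration, a parity check of this kind can be sketched as below; the paths are placeholders, and it assumes the intermediate checkpoint is a plain torch state dict whose parameter names already match the Hugging Face layout:

```python
import torch
from transformers import AutoModelForCausalLM

# Placeholder paths; the real locations are the Local source entries above.
hf_model = AutoModelForCausalLM.from_pretrained(
    "/path/to/hf-export", torch_dtype=torch.bfloat16
)
hf_state = hf_model.state_dict()

# Assumed: a torch-format intermediate checkpoint with HF-style tensor names.
ref_state = torch.load("/path/to/megatron-torch-checkpoint.pt", map_location="cpu")

# Exact parity means identical parameter name sets and bit-identical tensors.
assert hf_state.keys() == ref_state.keys(), "parameter name sets differ"
for name, tensor in hf_state.items():
    assert torch.equal(tensor.cpu(), ref_state[name]), f"mismatch in {name}"
print("exact tensor parity")
```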