# pretrain-normal-smollm-1p7b-100B-20n-2048sl-960gbsz
Converted Hugging Face base checkpoint from the Model Raising pretraining run.
## Details
- Architecture: LlamaForCausalLM
- Base model size: 1.7B
- Precision on disk: bfloat16
- Tokenizer: HuggingFaceTB/SmolLM2-1.7B-Instruct
This repo contains the verified Hugging Face export of the final pretraining checkpoint before SFT.
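A minimal loading sketch with `transformers`, assuming the hub repo id matches the title above (the actual namespace is not given on this card). It loads the weights in bfloat16 to match the on-disk precision and uses the tokenizer listed in the details:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id taken from this card's title; adjust the namespace as needed.
repo_id = "pretrain-normal-smollm-1p7b-100B-20n-2048sl-960gbsz"

# Load in bfloat16 to match the checkpoint's on-disk precision.
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-1.7B-Instruct")

# This is a base (pre-SFT) checkpoint: prompt for plain continuation,
# not with a chat template.
inputs = tokenizer("The capital of France is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```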