random-ministral-v01-12l-30m-body-32k-bf16
Randomly initialized standard MinistralForCausalLM checkpoint.
Specs:
- vocab size: 32000
- tied embeddings: yes
- embedding params: 16,384,000
- body params (excluding tied embeddings): 29,995,520
- total unique params: 46,379,520
- dtype: bfloat16
- max position embeddings: 32768
- sliding window: 4096
- hidden size: 512
- head dim: 64
- intermediate size: 1200
- layers: 12
- attention heads: 8
- key/value heads: 2
- attention pattern: 3 sliding-window layers followed by 1 full-attention layer, repeated 3 times
Layer types: ['sliding_attention', 'sliding_attention', 'sliding_attention', 'full_attention', 'sliding_attention', 'sliding_attention', 'sliding_attention', 'full_attention', 'sliding_attention', 'sliding_attention', 'sliding_attention', 'full_attention']
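The layer pattern and its effect on cache memory can be sketched in a few lines of Python. This is only an illustration built from the specs above: the rule "every 4th layer is full attention" is inferred from the listed layer types, and the helper `kv_cache_bytes` is a hypothetical name. It assumes sliding-attention layers cache at most `SLIDING_WINDOW` tokens, which depends on the cache implementation actually used.

```python
# Values copied from the specs above.
NUM_LAYERS = 12
SLIDING_WINDOW = 4096
KV_HEADS = 2
HEAD_DIM = 64
BYTES_PER_VALUE = 2  # bfloat16

# Assumption: every 4th layer uses full attention, all others are sliding.
layer_types = [
    "full_attention" if (i + 1) % 4 == 0 else "sliding_attention"
    for i in range(NUM_LAYERS)
]


def kv_cache_bytes(seq_len: int) -> int:
    """Rough KV cache size in bytes for one sequence of seq_len tokens.

    Assumes sliding layers keep at most SLIDING_WINDOW tokens and
    full-attention layers keep every token.
    """
    # Keys and values: 2 tensors of KV_HEADS * HEAD_DIM values per token per layer.
    per_token_per_layer = 2 * KV_HEADS * HEAD_DIM * BYTES_PER_VALUE
    total = 0
    for kind in layer_types:
        cached = seq_len if kind == "full_attention" else min(seq_len, SLIDING_WINDOW)
        total += cached * per_token_per_layer
    return total


if __name__ == "__main__":
    print(layer_types)
    print(kv_cache_bytes(32768) / 2**20, "MiB at the maximum context length")
```

Under these assumptions, the three full-attention layers dominate the cache at long context, since the nine sliding layers stop growing after 4096 tokens.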
This repo contains no trained weights; it is random initialization only.
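The parameter counts in the specs can be reproduced with plain arithmetic. The sketch below assumes a Mistral-style decoder block: bias-free q/k/v/o projections with grouped-query attention, a gated MLP with three projections, two RMSNorm weight vectors per layer, and one final norm. These layout details are assumptions rather than statements from this card, although they do yield the listed totals.

```python
# Values copied from the specs above.
VOCAB = 32000
HIDDEN = 512
HEAD_DIM = 64
HEADS = 8
KV_HEADS = 2
INTERMEDIATE = 1200
LAYERS = 12

# Tied input/output embedding matrix, counted once.
embedding_params = VOCAB * HIDDEN

# Attention projections (assumed bias-free).
q_proj = HIDDEN * HEADS * HEAD_DIM
k_proj = HIDDEN * KV_HEADS * HEAD_DIM
v_proj = HIDDEN * KV_HEADS * HEAD_DIM
o_proj = HEADS * HEAD_DIM * HIDDEN
attention_params = q_proj + k_proj + v_proj + o_proj

# Gated MLP: gate, up, and down projections (assumed bias-free).
mlp_params = 3 * HIDDEN * INTERMEDIATE

# Two RMSNorm weight vectors per layer (assumed).
norm_params = 2 * HIDDEN

per_layer = attention_params + mlp_params + norm_params

# All layers plus one final norm weight vector.
body_params = LAYERS * per_layer + HIDDEN
total_params = embedding_params + body_params

if __name__ == "__main__":
    print(f"embedding params: {embedding_params:,}")
    print(f"body params:      {body_params:,}")
    print(f"total params:     {total_params:,}")
```

Because the embeddings are tied, the output projection adds no extra parameters, so the total is just the embedding matrix plus the body.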