nanochat-d26 model, pretrained on 4xH100 GPUs using the NVIDIA ClimbMix dataset. The training command was:

torchrun --standalone --nproc_per_node=4 -m scripts.base_train --depth=26 --device-batch-size=16 --core-metric-every=1000 --target-param-data-ratio=8.25 --fp8 --eval-every=1000 --model-tag=runpod-d26-climbmix --sample-every=1000 --run=runpod-d26-climbmix

Nanochat code as of Feb 20, 2026, at commit c7ba25214276d165eeefca7cb2060587975db189 ("docs: fix typos in experiment log (#547)").

- run: runpod-d26-climbmix
- fp8: True
- fp8_recipe: tensorwise
- depth: 26
- aspect_ratio: 64
- head_dim: 128
- max_seq_len: 2048
- window_pattern: SSSL
- target_flops: -1.0000
- target_param_data_ratio: 8.2500
- device_batch_size: 16
- embedding_lr: 0.3000
- unembedding_lr: 0.0040
- weight_decay: 0.2000
- matrix_lr: 0.0200
- scalar_lr: 0.5000
- adam_beta1: 0.8000
- adam_beta2: 0.9500
- warmup_ratio: 0.0000
- warmdown_ratio: 0.5000
- final_lr_frac: 0.0000
- eval_tokens: 20,971,520
- model_tag: runpod-d26-climbmix
- Number of parameters: 1,681,790,292
- Number of FLOPs per token: 6.185320e+09
- Calculated number of iterations: 7226
- Number of training tokens: 7,577,010,176
- Tokens : Scaling params ratio: 8.2500
- DDP world size: 4
- Minimum validation bpb: 0.7076
- Final validation bpb: 0.7076
- CORE metric estimate: 0.2767
- MFU %: 61.13%
- Total training flops: 4.686623e+19
- Total training time: 323.59m
- Peak memory usage: 65855.70MiB
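
The figures in the report above can be cross-checked with a little arithmetic. The sketch below only uses numbers copied from the list; the variable names are illustrative, and the gradient-accumulation count is an inference that assumes each optimizer step is split evenly into micro-batches of `device_batch_size` sequences per GPU.

```python
# Values copied from the training report above.
num_iterations = 7226
total_tokens = 7_577_010_176
flops_per_token = 6.185320e9
device_batch_size = 16
max_seq_len = 2048
world_size = 4
training_minutes = 323.59

# Tokens consumed per optimizer step (divides evenly for this run).
tokens_per_step = total_tokens // num_iterations

# Tokens processed by one forward/backward pass across all GPUs.
tokens_per_micro_step = device_batch_size * max_seq_len * world_size

# Assumption: the remainder is covered by gradient accumulation.
grad_accum_steps = tokens_per_step // tokens_per_micro_step

# Total compute, which should agree with "Total training flops".
total_flops = flops_per_token * total_tokens

# Average sustained throughput per GPU over the reported training time.
avg_flops_per_gpu = total_flops / (training_minutes * 60) / world_size

print(f"tokens per step:       {tokens_per_step:,}")        # 1,048,576
print(f"tokens per micro step: {tokens_per_micro_step:,}")  # 131,072
print(f"grad accum steps:      {grad_accum_steps}")         # 8
print(f"total training flops:  {total_flops:.6e}")          # 4.686623e+19
print(f"avg FLOP/s per GPU:    {avg_flops_per_gpu:.3e}")
```

Dividing the per-GPU throughput by the accelerator's peak throughput gives a rough utilization figure. It will not necessarily match the reported MFU exactly, since that depends on the peak value and the timing window the training script uses.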


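The three schedule settings in the report (`warmup_ratio: 0`, `warmdown_ratio: 0.5`, `final_lr_frac: 0`) describe a learning-rate multiplier that stays at 1.0 for the first half of training and then decays linearly to zero. A minimal sketch of such a schedule follows; the function `lr_multiplier` is hypothetical and written for illustration, so it may differ in detail from what the training script actually does.

```python
def lr_multiplier(step, num_iterations,
                  warmup_ratio=0.0, warmdown_ratio=0.5, final_lr_frac=0.0):
    """Illustrative warmup / constant / linear-warmdown multiplier."""
    warmup_iters = round(warmup_ratio * num_iterations)
    warmdown_iters = round(warmdown_ratio * num_iterations)
    if step < warmup_iters:
        # Linear warmup (unused in this run, since warmup_ratio is 0).
        return (step + 1) / warmup_iters
    if step <= num_iterations - warmdown_iters:
        # Constant phase at the full learning rate.
        return 1.0
    # Linear warmdown from 1.0 toward final_lr_frac at the last step.
    progress = (num_iterations - step) / warmdown_iters
    return progress + (1.0 - progress) * final_lr_frac


# With the 7226 iterations of this run:
for step in (0, 3613, 5420, 7226):
    print(step, round(lr_multiplier(step, 7226), 4))
```

Under these assumptions each base learning rate in the list (`embedding_lr`, `unembedding_lr`, `matrix_lr`, `scalar_lr`) would be scaled by this multiplier at every step.
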
Step 07226 | CORE metric: 0.2767
<|bos|>The capital of France is Paris, and the capital of France is Paris. The capital of France is Paris
<|bos|>The chemical symbol of gold is Au. Gold is a chemical element with the symbol Au (from Latin: aur
<|bos|>If yesterday was Friday, then tomorrow will be Monday. If yesterday was Monday, then tomorrow will be Tuesday. If yesterday was
<|bos|>The opposite of hot is cold. The opposite of cold is hot. The opposite of hot is cold.
<|bos|>The planets of the solar system are: Mercury, Venus, Earth, Mars, Jupiter, Saturn, Uranus,
<|bos|>My favorite color is blue. I love the color blue. I love the color blue. I love
<|bos|>If 5*x + 3 = 13, then x is 3. If 5*x + 3 = 13, then