nanochat-varlen-d24

nanochat d24 trained with batch size 16, FP8, and flashattention_varlen. Training ran on an 8xH100 node and finished about 1.6% faster than the baseline run.

Final results:

step 05567/05568 (99.98%) | loss: 2.388741 | lrm: 0.05 | dt: 1048.30ms | tok/sec: 1,000,264 | bf16_mfu: 60.37 | epoch: 1 pq: 117 rg: 64 | total time: 97.38m | eta: 0.0m
Step 05568 | Validation bpb: 0.724772
Step 05568 | CORE metric: 0.2614
Peak memory usage: 52865.94MiB
Total training time: 97.38m
Minimum validation bpb: 0.724772
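
As a quick sanity check on the final log line, multiplying the reported throughput by the step time gives the number of tokens processed per step, which comes out close to 2^20 (1,048,576). The sketch below parses the line with regular expressions. The helper name `parse_step_log` is hypothetical and not part of nanochat, and the total-token estimate assumes every step processed the same number of tokens as the last one.

```python
import re

# Final training log line, copied from the results above.
LOG_LINE = (
    "step 05567/05568 (99.98%) | loss: 2.388741 | lrm: 0.05 | dt: 1048.30ms | "
    "tok/sec: 1,000,264 | bf16_mfu: 60.37 | epoch: 1 pq: 117 rg: 64 | "
    "total time: 97.38m | eta: 0.0m"
)


def parse_step_log(line):
    """Hypothetical helper: extract step time (ms), tokens/sec, and total steps."""
    dt_ms = float(re.search(r"dt:\s*([\d.]+)ms", line).group(1))
    tok_per_sec = int(
        re.search(r"tok/sec:\s*([\d,]+)", line).group(1).replace(",", "")
    )
    total_steps = int(re.search(r"step\s+\d+/(\d+)", line).group(1))
    return dt_ms, tok_per_sec, total_steps


dt_ms, tok_per_sec, total_steps = parse_step_log(LOG_LINE)

# Tokens per optimizer step = throughput (tok/s) * step duration (s).
tokens_per_step = tok_per_sec * dt_ms / 1000.0

# Assumption: every step processed the same number of tokens as this one.
approx_total_tokens = tokens_per_step * total_steps

print(f"tokens per step: {tokens_per_step:,.0f} (2^20 = {2**20:,})")
print(f"approx. total tokens: {approx_total_tokens / 1e9:.2f}B")
```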

Download

import os
from huggingface_hub import hf_hub_download

repo_id = "ChrisMcCormick/nanochat-varlen-d24-2026-03-22"
cache_dir = os.path.expanduser("~/.cache/nanochat")

# Download model checkpoint
hf_hub_download(
    repo_id=repo_id,
    filename="base_checkpoints/d24_speedrun/model_005568.pt",
    local_dir=cache_dir,
)

# Download tokenizer files
hf_hub_download(
    repo_id=repo_id,
    filename="tokenizer/token_bytes.pt",
    local_dir=cache_dir,
)
hf_hub_download(
    repo_id=repo_id,
    filename="tokenizer/tokenizer.pkl",
    local_dir=cache_dir,
)
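
When `local_dir` is set, `hf_hub_download` keeps the repo's folder structure under that directory, so the three files should end up at the relative paths used above. A minimal sketch for confirming the download landed where expected, using only the standard library (the helper name `missing_files` is hypothetical and not part of nanochat or `huggingface_hub`):

```python
import os

CACHE_DIR = os.path.expanduser("~/.cache/nanochat")

# Relative paths of the files requested in the download snippet above.
EXPECTED_FILES = [
    "base_checkpoints/d24_speedrun/model_005568.pt",
    "tokenizer/token_bytes.pt",
    "tokenizer/tokenizer.pkl",
]


def missing_files(cache_dir, filenames=EXPECTED_FILES):
    """Hypothetical helper: return the expected files not present under cache_dir."""
    return [
        name
        for name in filenames
        if not os.path.isfile(os.path.join(cache_dir, name))
    ]


missing = missing_files(CACHE_DIR)
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All checkpoint and tokenizer files are present in", CACHE_DIR)
```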