NanoGPT ROCStories Model (SwiGLU+RoPE, Weight Interpolation)

  • Model size: 31.30M parameters (n_layer=9, n_head=8, n_embd=352)
  • Method: Weight interpolation (alpha=0.7) between the TinyStories-pretrained checkpoint and the ROCStories fine-tuned checkpoint
  • Test perplexity (PPL): 19.90 (measured with eval.py on eval_stories.txt)
  • Block size (context length): 256 tokens
  • Architecture: SwiGLU activation, RoPE positional encoding, RMSNorm
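
The weight interpolation named under Method can be sketched as a per-parameter linear blend of two checkpoints. This is only an illustrative sketch, not the code used to build this model: the function name interpolate_weights is hypothetical, checkpoints are represented as plain dicts of float lists rather than framework tensors, and it assumes alpha weights the fine-tuned checkpoint (the card does not say which side alpha=0.7 applies to).

```python
def interpolate_weights(pretrained, finetuned, alpha=0.7):
    """Blend two checkpoints: alpha * finetuned + (1 - alpha) * pretrained.

    Assumption: alpha is the weight on the fine-tuned checkpoint.
    Both arguments map parameter names to flat lists of floats.
    """
    if pretrained.keys() != finetuned.keys():
        raise ValueError("checkpoints must have identical parameter names")

    merged = {}
    for name, w_pre in pretrained.items():
        w_ft = finetuned[name]
        if len(w_pre) != len(w_ft):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Element-wise linear interpolation between the two weight vectors.
        merged[name] = [alpha * f + (1.0 - alpha) * p for p, f in zip(w_pre, w_ft)]
    return merged


if __name__ == "__main__":
    # Toy checkpoints with a single two-element parameter.
    pre = {"w": [0.0, 2.0]}
    ft = {"w": [1.0, 4.0]}
    print(interpolate_weights(pre, ft, alpha=0.5))
```

With real checkpoints the same blend would be applied to each tensor in the two state dicts, which requires both models to share the same architecture and parameter names.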

Generation Parameters

  • temperature: 0.7
  • top_k: 40
  • max_new_tokens: 150
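
As a rough illustration of how temperature and top_k typically interact during sampling, the sketch below picks one next token from a list of logits. It is a minimal standard-library version under stated assumptions, not this model's generation code: the function name sample_next_token is hypothetical, and logits are a plain Python list instead of a tensor.

```python
import math
import random


def sample_next_token(logits, temperature=0.7, top_k=40, rng=random):
    """Sample a token index using temperature scaling and top-k filtering."""
    # Keep only the indices of the top_k largest logits.
    k = min(top_k, len(logits))
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]

    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [logits[i] / temperature for i in top]

    # Softmax weights, shifted by the max for numerical stability.
    m = max(scaled)
    weights = [math.exp(s - m) for s in scaled]

    # random.choices normalizes the weights internally.
    return rng.choices(top, weights=weights, k=1)[0]


if __name__ == "__main__":
    toy_logits = [0.1, 5.0, 2.0, -1.0]
    print(sample_next_token(toy_logits, temperature=0.7, top_k=2))
```

In a generation loop, a step like this would be repeated up to max_new_tokens times, appending each sampled token to the context and truncating the context to the block size.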