# TinyBuddy-30M

A 30-million-parameter GPT-style transformer trained on TinyStories.

## Architecture

- 6 layers, 8 attention heads, 256 embedding dim
- 50,000 vocabulary size (untied input and output embedding weights)
- 512 context length (trained on 128-token sequences for speed)

## Training

- Dataset: TinyStories (5,000 stories)
- Steps: 1,500
- Hardware: CPU only
- Final loss: ~5.5 (recognizable story fragments, but not good)

## What It Can Do

- Generate 2-3 word fragments that resemble story patterns
- Sometimes repeat words from the prompt
- Produce gibberish that's trying to be English

## What It Cannot Do

- Tell a coherent story
- Answer questions
- Do anything useful

## Why It Exists

To demonstrate that even a tiny transformer learns *patterns*, not rules. This is a real AI, just a very small one.
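
The 30 million figure can be roughly reproduced from the numbers listed under Architecture. The sketch below is not the model's actual code: it assumes a GPT-2-style block (learned positional embeddings, a 4x-wide MLP, biases on every linear layer, two LayerNorms per block, and a final LayerNorm), none of which this README confirms. The function name `count_params` and its arguments are hypothetical.

```python
def count_params(n_layer=6, d_model=256, vocab_size=50_000, n_ctx=512, tied=False):
    """Estimate the parameter count of a GPT-2-style decoder (assumed layout)."""
    tok_emb = vocab_size * d_model          # token embedding table
    pos_emb = n_ctx * d_model               # learned positional embeddings (assumption)

    # One transformer block. The number of heads does not change the count,
    # because the heads split d_model between them.
    attn = 4 * d_model * d_model + 4 * d_model   # QKV + output projection, with biases
    mlp = 8 * d_model * d_model + 5 * d_model    # d -> 4d -> d, with biases (assumption)
    norms = 2 * 2 * d_model                      # two LayerNorms, scale + shift each
    block = attn + mlp + norms

    final_norm = 2 * d_model
    # Untied weights mean the output head is a second vocab-sized matrix.
    lm_head = 0 if tied else vocab_size * d_model

    total = tok_emb + pos_emb + n_layer * block + final_norm + lm_head
    return {
        "embeddings": tok_emb + pos_emb,
        "blocks": n_layer * block,
        "lm_head": lm_head,
        "total": total,
    }


if __name__ == "__main__":
    for name, value in count_params().items():
        print(f"{name:>10}: {value:,}")
```

Under these assumptions the total comes to about 30.5 million, and roughly 25.6 million of that sits in the two vocabulary-sized matrices. Calling `count_params(tied=True)` shows how much smaller the model would be with tied weights.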