| <html lang="en"> | |
| <head> | |
| <meta charset="UTF-8"> | |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"> | |
| <title>DeepSeek Beat Me To My Own Idea And I Am Not Okay | TinyMemoryLM</title> | |
| <link rel="stylesheet" href="bluesheet.css"> | |
| <link rel="preconnect" href="https://fonts.googleapis.com"> | |
| <link rel="preconnect" href="https://fonts.gstatic.com" crossorigin> | |
| <link href="https://fonts.googleapis.com/css2?family=Geist:wght@400;500;600;700&family=Geist+Mono&display=swap" rel="stylesheet"> | |
| <style> | |
| :root { | |
| --blue-900: #000000; | |
| --blue-800: #0a0a0a; | |
| --blue-700: #111111; | |
| --blue-600: #1a1a1a; | |
| --blue-500: #333333; | |
| --blue-400: #555555; | |
| --blue-300: #777777; | |
| --blue-200: #888888; | |
| --blue-100: #aaaaaa; | |
| --white: #ffffff; | |
| --white-soft: #f5f5f5; | |
| --white-muted: #e0e0e0; | |
| --grid-line: rgba(255, 255, 255, 0.03); | |
| --grid-line-major: rgba(255, 255, 255, 0.06); | |
| --accent: #ededed; | |
| --accent-muted: #888888; | |
| --font-sans: 'Geist', -apple-system, BlinkMacSystemFont, sans-serif; | |
| --font-mono: 'Geist Mono', 'SF Mono', 'Fira Code', monospace; | |
| --container-max: 1100px; | |
| } | |
| * { box-sizing: border-box; margin: 0; padding: 0; } | |
| html { font-size: 16px; scroll-behavior: smooth; } | |
| body { font-family: var(--font-sans); background: var(--blue-900); color: var(--white-muted); line-height: 1.7; -webkit-font-smoothing: antialiased; } | |
| a { color: var(--white); text-decoration: none; transition: color 0.15s ease; } | |
| a:hover { color: var(--accent); } | |
| .container { max-width: var(--container-max); margin: 0 auto; padding: 0 24px; } | |
| nav { position: fixed; top: 0; left: 0; right: 0; z-index: 100; background: rgba(0, 0, 0, 0.85); backdrop-filter: blur(12px); border-bottom: 1px solid var(--blue-600); padding: 16px 0; } | |
| nav .container { display: flex; justify-content: space-between; align-items: center; } | |
| .nav-brand { font-size: 18px; font-weight: 600; color: var(--white); display: flex; align-items: center; gap: 8px; } | |
| .nav-brand span { color: var(--accent); } | |
| .nav-links { display: flex; gap: 32px; } | |
| .nav-links a { font-size: 14px; font-weight: 500; color: var(--blue-200); } | |
| .nav-links a:hover { color: var(--white); } | |
| .post { padding: 140px 0 80px; } | |
| .post-back { display: inline-block; color: var(--blue-200); font-size: 14px; margin-bottom: 32px; } | |
| .post-back:hover { color: var(--accent); } | |
| .post-back::before { content: '← '; } | |
| .post-meta { display: flex; gap: 12px; margin-bottom: 20px; } | |
| .post-date { font-size: 13px; color: var(--blue-200); font-family: var(--font-mono); } | |
| .post-tag { font-size: 11px; font-weight: 600; text-transform: uppercase; letter-spacing: 0.05em; color: var(--white); background: rgba(255, 255, 255, 0.08); padding: 4px 10px; border-radius: 4px; } | |
| .post h1 { font-size: 36px; font-weight: 700; color: var(--white); margin-bottom: 32px; line-height: 1.2; letter-spacing: -0.02em; } | |
| .post-body p { font-size: 17px; line-height: 1.8; margin-bottom: 24px; color: var(--blue-200); } | |
| .post-body p:first-of-type { font-size: 20px; color: var(--white-muted); } | |
| .post-body h2 { font-size: 24px; font-weight: 600; color: var(--white); margin: 48px 0 20px; } | |
| .post-body blockquote { border-left: 3px solid var(--accent); padding: 20px 24px; margin: 32px 0; background: var(--blue-800); border-radius: 0 8px 8px 0; } | |
| .post-body blockquote p { font-size: 16px; font-style: italic; color: var(--blue-200); margin: 0; } | |
| .post-body hr { border: none; height: 1px; background: var(--blue-600); margin: 48px 0; } | |
| .comparison-grid { display: grid; grid-template-columns: 1fr 1fr; gap: 16px; margin: 24px 0; } | |
| .comparison-card { background: var(--blue-800); border: 1px solid var(--blue-600); border-radius: 8px; padding: 20px; } | |
| .comparison-card h3 { font-size: 14px; color: var(--blue-200); margin-bottom: 12px; text-transform: uppercase; letter-spacing: 0.05em; } | |
| .comparison-card.me { border-color: var(--gray-4, var(--blue-500)); } | |
| .comparison-card.them { border-color: var(--accent); } | |
| .comparison-card p { font-size: 14px; color: var(--blue-200); } | |
| .code-block { background: var(--blue-800); border: 1px solid var(--blue-600); border-radius: 8px; padding: 20px; margin: 24px 0; font-family: var(--font-mono); font-size: 13px; overflow-x: auto; } | |
| .code-block .comment { color: var(--blue-200); font-style: italic; display: block; margin-top: 4px; } | |
| .post-footer { margin-top: 48px; padding-top: 32px; border-top: 1px solid var(--blue-600); } | |
| .post-footer p { font-size: 14px; color: var(--blue-200); font-style: italic; margin: 0; } | |
| footer { padding: 40px 0; background: var(--blue-800); border-top: 1px solid var(--blue-600); text-align: center; } | |
| footer p { color: var(--blue-200); font-size: 14px; margin-bottom: 8px; } | |
| footer a { color: var(--blue-200); } | |
| footer a:hover { color: var(--accent); } | |
| @media (max-width: 768px) { .post h1 { font-size: 28px; } .nav-links { display: none; } .comparison-grid { grid-template-columns: 1fr; } } | |
| </style> | |
| </head> | |
| <body> | |
| <svg class="scribbles" viewBox="0 0 1440 900" preserveAspectRatio="xMidYMid slice"> | |
| <path d="M100,50 Q150,30 200,60 T300,40 T400,70" fill="none" stroke="white" stroke-width="1"/> | |
| <path d="M800,200 Q850,180 900,210 T1000,190 T1100,220" fill="none" stroke="white" stroke-width="0.8"/> | |
| <path d="M200,700 Q250,680 300,710 T400,690 T500,720" fill="none" stroke="white" stroke-width="0.6"/> | |
| <path d="M1200,400 Q1250,380 1300,410 T1400,390" fill="none" stroke="white" stroke-width="0.7"/> | |
| <path d="M50,400 Q100,380 150,420 T250,400" fill="none" stroke="white" stroke-width="0.5"/> | |
| <circle cx="350" cy="150" r="30" fill="none" stroke="white" stroke-width="0.6"/> | |
| <circle cx="1100" cy="600" r="25" fill="none" stroke="white" stroke-width="0.5"/> | |
| <path d="M600,100 L620,80 L640,100 L660,80" fill="none" stroke="white" stroke-width="0.7"/> | |
| <path d="M1300,750 Q1320,730 1340,760 T1380,740" fill="none" stroke="white" stroke-width="0.5"/> | |
| <path d="M100,800 Q120,780 140,810 T180,790 T220,820" fill="none" stroke="white" stroke-width="0.6"/> | |
| <path d="M700,500 Q720,480 740,510 T780,490 T820,520" fill="none" stroke="white" stroke-width="0.4"/> | |
| <path d="M400,300 C420,280 440,320 460,300 C480,280 500,320 520,300" fill="none" stroke="white" stroke-width="0.5"/> | |
| <path d="M900,700 C920,680 940,720 960,700 C980,680 1000,720 1020,700" fill="none" stroke="white" stroke-width="0.6"/> | |
| <path d="M150,250 Q170,230 190,260 Q210,240 230,270" fill="none" stroke="white" stroke-width="0.4"/> | |
| <path d="M1050,100 Q1070,80 1090,110 Q1110,90 1130,120" fill="none" stroke="white" stroke-width="0.5"/> | |
| <path d="M500,850 C520,830 540,860 560,840 C580,820 600,860 620,840" fill="none" stroke="white" stroke-width="0.4"/> | |
| <path d="M1350,50 Q1370,30 1390,60 T1430,40" fill="none" stroke="white" stroke-width="0.5"/> | |
| <path d="M30,600 Q50,580 70,610 T110,590" fill="none" stroke="white" stroke-width="0.4"/> | |
| </svg> | |
| <nav> | |
| <div class="container"> | |
| <a href="index.html" class="nav-brand"><span>/</span>TinyMemoryLM</a> | |
| <div class="nav-links"> | |
| <a href="index.html">Home</a> | |
| <a href="blog.html">Blog</a> | |
| <a href="status.html">Status</a> | |
| </div> | |
| </div> | |
| </nav> | |
| <main> | |
| <article class="post"> | |
| <div class="container"> | |
| <a href="blog.html" class="post-back">Back to Blog</a> | |
| <header> | |
| <div class="post-meta"> | |
| <span class="post-date">2026-03-26</span> | |
| <span class="post-tag">Research Pain</span> | |
| </div> | |
| <h1>DeepSeek Beat Me To My Own Idea And I Am Not Okay</h1> | |
| </header> | |
| <div class="post-body"> | |
| <p>I had an idea. A good idea. I called it EMM: External Memory Module. The concept was simple. Train the memory separately. Plug it into the model. Decode vectorized data. O(1) retrieval. Minimal overhead. Elegant.</p> | |
| <p>I wrote notes. I sketched diagrams. I told my team about it. I was going to implement it. I was going to publish it. I was going to be the person who solved the memory problem in transformers.</p> | |
| <p>Then DeepSeek published Engram on January 12, 2026. And I died a little inside.</p> | |
| <blockquote> | |
| <p>There is no pain like reading a paper and realizing someone else had your idea first. Especially when they open-sourced it before you finished your notes.</p> | |
| </blockquote> | |
| <h2>What Engram Actually Is</h2> | |
| <p>Engram is a conditional external memory module with O(1) knowledge lookup. It structurally separates static knowledge from the transformer backbone. It uses N-gram hashing to map token sequences to rows of a learnable lookup table. That table can hold over 100 billion parameters in CPU RAM.</p> | |
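| <p>To make the lookup concrete, here is a minimal Python sketch of a hashed N-gram memory. It is not DeepSeek's code. The class name <code>NGramHashMemory</code>, the rolling hash, and the toy table size are assumptions for illustration, and the rows are random numbers standing in for learned parameters. The point is only that fetching a row costs the same no matter how large the table grows.</p> | |

```python
import random
from typing import List, Optional, Tuple


class NGramHashMemory:
    """Toy hashed n-gram memory (hypothetical, not the Engram implementation)."""

    def __init__(self, num_slots: int, dim: int, n: int = 2, seed: int = 0):
        rng = random.Random(seed)
        self.num_slots = num_slots
        self.n = n
        # Random rows stand in for parameters a real model would learn.
        self.table: List[List[float]] = [
            [rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(num_slots)
        ]

    def slot(self, ngram: Tuple[int, ...]) -> int:
        """Map an n-gram of token ids to a table index with a rolling hash."""
        h = 0
        for tok in ngram:
            # Assumed hash constants; any deterministic hash would do here.
            h = (h * 1_000_003 + tok + 1) % 2_147_483_647
        return h % self.num_slots

    def lookup(self, token_ids: List[int], position: int) -> Optional[List[float]]:
        """Return the row for the n-gram ending at `position`, or None."""
        start = position - self.n + 1
        if start < 0:
            return None  # not enough left context to form an n-gram
        ngram = tuple(token_ids[start:position + 1])
        # One hash plus one index: the cost does not depend on num_slots.
        return self.table[self.slot(ngram)]
```

| <p>With <code>mem = NGramHashMemory(num_slots=64, dim=4)</code> and the tokens <code>[5, 9, 5, 9]</code>, positions 1 and 3 both end the bigram (5, 9), so <code>mem.lookup</code> returns the same row for both. Different N-grams can also land in the same slot. Handling those collisions is left out of this sketch.</p> | |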
| <p>The paper is titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models". Liang Wenfeng, DeepSeek's founder, is listed as a co-author. It is open source on GitHub. It was developed in collaboration with Peking University.</p> | |
| <p>It is exactly what I wanted to build. Except they built it. And they tested it. And they published it. And they open-sourced it.</p> | |
| <h2>The Comparison That Hurts</h2> | |
| <div class="comparison-grid"> | |
| <div class="comparison-card me"> | |
| <h3>My EMM (Notes Only)</h3> | |
| <p>Concept: Train memory separately. Plug into model. Decode vectorized data.</p> | |
| <p>Implementation: None. Still in markdown files.</p> | |
| <p>Results: A folder called EMM_idea/ and growing existential dread.</p> | |
| </div> | |
| <div class="comparison-card them"> | |
| <h3>DeepSeek Engram (Published)</h3> | |
| <p>Concept: Conditional memory via scalable lookup. Separate static knowledge from reasoning.</p> | |
| <p>Implementation: Full PyTorch module. GitHub repo. Demo scripts.</p> | |
| <p>Results: Published paper. Open weights. Proven improvements.</p> | |
| </div> | |
| </div> | |
| <p>They did not just have the idea. They executed. They tested. They published. They open-sourced. I have a text file that says "train separately, plug in, decode vectors" and a lot of unused time.</p> | |
| <div class="code-block"> | |
| <span class="comment"># My EMM notes vs. Engram reality</span><br> | |
| Me: "Train it separately. Plug into model. Decode vectorized data."<br> | |
| Engram: "Structurally separate static knowledge from transformer backbone."<br> | |
| <span class="comment"># Same idea. Different levels of completion.</span> | |
| </div> | |
| <h2>Why This Stings</h2> | |
| <p>It is not just that they had the idea first. It is that the core concept is the same. Separate memory from computation. External lookup. Fast retrieval. I wrote this down weeks before January 12. They published on January 12. The timeline hurts.</p> | |
| <p>But here is the thing. Ideas are not unique. Good ideas especially are not unique. Multiple people think of them at similar times. The difference is who ships. DeepSeek shipped. I did not.</p> | |
| <blockquote> | |
| <p>Research is not about having ideas. It is about having ideas and then doing the work. I had the idea. I did not do the work.</p> | |
| </blockquote> | |
| <h2>What I Am Doing About It</h2> | |
| <p>I could give up. I could say "they did it better" and move on. I am not doing that. Engram is open source. I can use it. I can learn from it. I can integrate it into my own models.</p> | |
| <p>I am going to use DeepSeek's Engram implementation in TMLM-Sonnet-2. And in Opus. And in future models. Why not? It is open source. It solves the problem I cared about. It is better than what I would have built anyway.</p> | |
| <div class="code-block"> | |
| <span class="comment"># TMLM-Sonnet-2 plans</span><br> | |
| Model: 300M parameters<br> | |
| Memory: Engram external module<br> | |
| Optimizer: Muon<br> | |
| GPU: 5090 OC LC @ 800W<br> | |
| ETA: Probably never but we try<br> | |
| <span class="comment"># Standing on the shoulders of giants.</span> | |
| </div> | |
| <h2>What I Learned</h2> | |
| <p>First, ship faster. If I had started implementing EMM when I first thought of it, maybe I would have something to show. Maybe not better than Engram. But something. Now I have notes and regret.</p> | |
| <p>Second, open source is beautiful. DeepSeek could have kept Engram proprietary. They did not. They open-sourced it. Now I can use it. Now everyone can use it. This is how progress works.</p> | |
| <p>Third, I am not special. My ideas are not unique. This is humbling. This is also freeing. I do not need to be the first. I just need to contribute. To try. To build something even if it is built on others' work.</p> | |
| <h2>Sonnet Training Update</h2> | |
| <p>Sonnet is at 15 percent now. It has been running for days. It will finish eventually. Sonnet-2 will use Engram. It will have external memory. It will still probably give fish answers to math questions. But it will have external memory.</p> | |
| <p>Opus is still a dream. Six hundred million parameters. Forty days of training. With Engram it might actually remember things. That would be new. That would be progress.</p> | |
| <h2>Final Thoughts</h2> | |
| <p>DeepSeek beat me to my own idea. It hurts. It also frees me. I do not need to build EMM anymore. Engram exists. It is open. I can use it. I will use it.</p> | |
| <p>Thank you, DeepSeek. For the idea. For the code. For the lesson in shipping. And for the slight existential crisis. I needed it. Probably.</p> | |
| <p>TMLM-Sonnet-2 will have Engram. TMLM-Opus will have Engram. My tiny models will have external memory. They will still be tiny. They will still be confused. But they will have memory. That counts for something.</p> | |
| <hr> | |
| </div> | |
| <footer class="post-footer"> | |
| <p>Current status: EMM idea abandoned. Engram implementation downloaded. Sonnet at 15%. Sonnet-2 planned with Engram. Will ship eventually. Maybe.</p> | |
| </footer> | |
| </div> | |
| </article> | |
| </main> | |
| <footer> | |
| <div class="container"> | |
| <p>Built with curiosity over compute</p> | |
| <p>TinyMemoryLM by AILAY | 2026</p> | |
| </div> | |
| </footer> | |
| </body> | |
| </html> |