<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>DeepSeek Beat Me To My Own Idea And I Am Not Okay | TinyMemoryLM</title>
<link rel="stylesheet" href="bluesheet.css">
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Geist:wght@400;500;600;700&family=Geist+Mono&display=swap" rel="stylesheet">
<style>
:root {
--blue-900: #000000;
--blue-800: #0a0a0a;
--blue-700: #111111;
--blue-600: #1a1a1a;
--blue-500: #333333;
--blue-400: #555555;
--blue-300: #777777;
--blue-200: #888888;
--blue-100: #aaaaaa;
--white: #ffffff;
--white-soft: #f5f5f5;
--white-muted: #e0e0e0;
--grid-line: rgba(255, 255, 255, 0.03);
--grid-line-major: rgba(255, 255, 255, 0.06);
--accent: #ededed;
--accent-muted: #888888;
--font-sans: 'Geist', -apple-system, BlinkMacSystemFont, sans-serif;
--font-mono: 'Geist Mono', 'SF Mono', 'Fira Code', monospace;
--container-max: 1100px;
}
* { box-sizing: border-box; margin: 0; padding: 0; }
html { font-size: 16px; scroll-behavior: smooth; }
body { font-family: var(--font-sans); background: var(--blue-900); color: var(--white-muted); line-height: 1.7; -webkit-font-smoothing: antialiased; }
a { color: var(--white); text-decoration: none; transition: color 0.15s ease; }
a:hover { color: var(--accent); }
.container { max-width: var(--container-max); margin: 0 auto; padding: 0 24px; }
nav { position: fixed; top: 0; left: 0; right: 0; z-index: 100; background: rgba(0, 0, 0, 0.85); backdrop-filter: blur(12px); border-bottom: 1px solid var(--blue-600); padding: 16px 0; }
nav .container { display: flex; justify-content: space-between; align-items: center; }
.nav-brand { font-size: 18px; font-weight: 600; color: var(--white); display: flex; align-items: center; gap: 8px; }
.nav-brand span { color: var(--accent); }
.nav-links { display: flex; gap: 32px; }
.nav-links a { font-size: 14px; font-weight: 500; color: var(--blue-200); }
.nav-links a:hover { color: var(--white); }
.post { padding: 140px 0 80px; }
.post-back { display: inline-block; color: var(--blue-200); font-size: 14px; margin-bottom: 32px; }
.post-back:hover { color: var(--accent); }
.post-back::before { content: '← '; }
.post-meta { display: flex; gap: 12px; margin-bottom: 20px; }
.post-date { font-size: 13px; color: var(--blue-200); font-family: var(--font-mono); }
.post-tag { font-size: 11px; font-weight: 600; text-transform: uppercase; letter-spacing: 0.05em; color: var(--white); background: rgba(255, 255, 255, 0.08); padding: 4px 10px; border-radius: 4px; }
.post h1 { font-size: 36px; font-weight: 700; color: var(--white); margin-bottom: 32px; line-height: 1.2; letter-spacing: -0.02em; }
.post-body p { font-size: 17px; line-height: 1.8; margin-bottom: 24px; color: var(--blue-200); }
.post-body p:first-of-type { font-size: 20px; color: var(--white-muted); }
.post-body h2 { font-size: 24px; font-weight: 600; color: var(--white); margin: 48px 0 20px; }
.post-body blockquote { border-left: 3px solid var(--accent); padding: 20px 24px; margin: 32px 0; background: var(--blue-800); border-radius: 0 8px 8px 0; }
.post-body blockquote p { font-size: 16px; font-style: italic; color: var(--blue-200); margin: 0; }
.post-body hr { border: none; height: 1px; background: var(--blue-600); margin: 48px 0; }
.comparison-grid { display: grid; grid-template-columns: 1fr 1fr; gap: 16px; margin: 24px 0; }
.comparison-card { background: var(--blue-800); border: 1px solid var(--blue-600); border-radius: 8px; padding: 20px; }
.comparison-card h3 { font-size: 14px; color: var(--blue-200); margin-bottom: 12px; text-transform: uppercase; letter-spacing: 0.05em; }
.comparison-card.me { border-color: var(--blue-500); }
.comparison-card.them { border-color: var(--accent); }
.comparison-card p { font-size: 14px; color: var(--blue-200); }
.code-block { background: var(--blue-800); border: 1px solid var(--blue-600); border-radius: 8px; padding: 20px; margin: 24px 0; font-family: var(--font-mono); font-size: 13px; overflow-x: auto; }
.code-block .comment { color: var(--blue-200); font-style: italic; display: block; margin-top: 4px; }
.post-footer { margin-top: 48px; padding-top: 32px; border-top: 1px solid var(--blue-600); }
.post-footer p { font-size: 14px; color: var(--blue-200); font-style: italic; margin: 0; }
footer { padding: 40px 0; background: var(--blue-800); border-top: 1px solid var(--blue-600); text-align: center; }
footer p { color: var(--blue-200); font-size: 14px; margin-bottom: 8px; }
footer a { color: var(--blue-200); }
footer a:hover { color: var(--accent); }
@media (max-width: 768px) { .post h1 { font-size: 28px; } .nav-links { display: none; } .comparison-grid { grid-template-columns: 1fr; } }
</style>
</head>
<body>
<svg class="scribbles" viewBox="0 0 1440 900" preserveAspectRatio="xMidYMid slice">
<path d="M100,50 Q150,30 200,60 T300,40 T400,70" fill="none" stroke="white" stroke-width="1"/>
<path d="M800,200 Q850,180 900,210 T1000,190 T1100,220" fill="none" stroke="white" stroke-width="0.8"/>
<path d="M200,700 Q250,680 300,710 T400,690 T500,720" fill="none" stroke="white" stroke-width="0.6"/>
<path d="M1200,400 Q1250,380 1300,410 T1400,390" fill="none" stroke="white" stroke-width="0.7"/>
<path d="M50,400 Q100,380 150,420 T250,400" fill="none" stroke="white" stroke-width="0.5"/>
<circle cx="350" cy="150" r="30" fill="none" stroke="white" stroke-width="0.6"/>
<circle cx="1100" cy="600" r="25" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M600,100 L620,80 L640,100 L660,80" fill="none" stroke="white" stroke-width="0.7"/>
<path d="M1300,750 Q1320,730 1340,760 T1380,740" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M100,800 Q120,780 140,810 T180,790 T220,820" fill="none" stroke="white" stroke-width="0.6"/>
<path d="M700,500 Q720,480 740,510 T780,490 T820,520" fill="none" stroke="white" stroke-width="0.4"/>
<path d="M400,300 C420,280 440,320 460,300 C480,280 500,320 520,300" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M900,700 C920,680 940,720 960,700 C980,680 1000,720 1020,700" fill="none" stroke="white" stroke-width="0.6"/>
<path d="M150,250 Q170,230 190,260 Q210,240 230,270" fill="none" stroke="white" stroke-width="0.4"/>
<path d="M1050,100 Q1070,80 1090,110 Q1110,90 1130,120" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M500,850 C520,830 540,860 560,840 C580,820 600,860 620,840" fill="none" stroke="white" stroke-width="0.4"/>
<path d="M1350,50 Q1370,30 1390,60 T1430,40" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M30,600 Q50,580 70,610 T110,590" fill="none" stroke="white" stroke-width="0.4"/>
</svg>
<nav>
<div class="container">
<a href="index.html" class="nav-brand"><span>/</span>TinyMemoryLM</a>
<div class="nav-links">
<a href="index.html">Home</a>
<a href="blog.html">Blog</a>
<a href="status.html">Status</a>
</div>
</div>
</nav>
<main>
<article class="post">
<div class="container">
<a href="blog.html" class="post-back">Back to Blog</a>
<header>
<div class="post-meta">
<span class="post-date">2026-03-26</span>
<span class="post-tag">Research Pain</span>
</div>
<h1>DeepSeek Beat Me To My Own Idea And I Am Not Okay</h1>
</header>
<div class="post-body">
<p>I had an idea. A good idea. I called it EMM: External Memory Module. The concept was simple. Train the memory separately. Plug it into the model. Decode vectorized data. O(1) retrieval. Minimal overhead. Elegant.</p>
<p>I wrote notes. I sketched diagrams. I told my team about it. I was going to implement it. I was going to publish it. I was going to be the person who solved the memory problem in transformers.</p>
<p>Then DeepSeek published Engram on January 12, 2026. And I died a little inside.</p>
<blockquote>
<p>There is no pain like reading a paper and realizing someone else had your idea first. Especially when they open sourced it before you finished your notes.</p>
</blockquote>
<h2>What Engram Actually Is</h2>
<p>Engram is a conditional external memory module with O(1) constant-time knowledge lookup. It structurally separates static knowledge from the transformer backbone. It uses N-gram hashing to map token sequences to a learnable lookup table. It can store over 100 billion parameters in CPU RAM.</p>
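<p>To make the lookup idea concrete, here is a toy sketch of hashed n-gram memory in Python. Everything here (the class name, the blake2b bucketing, the randomly initialized table) is my own illustration, not DeepSeek's actual code or API; in Engram the table entries are learnable parameters, not random placeholders.</p>

```python
import hashlib
import random


class NgramLookupMemory:
    """Toy sketch: each trailing n-gram of token ids hashes into a
    fixed-size table of vectors, so retrieval is one O(1) table lookup
    per position, independent of how much knowledge is stored."""

    def __init__(self, table_size: int = 2**16, dim: int = 8, n: int = 2, seed: int = 0):
        rng = random.Random(seed)
        self.n = n
        self.table_size = table_size
        # Learnable embeddings in a real system; random placeholders here.
        self.table = [[rng.uniform(-0.1, 0.1) for _ in range(dim)]
                      for _ in range(table_size)]

    def _bucket(self, ngram: tuple) -> int:
        # Stable hash of the n-gram into a table slot.
        h = hashlib.blake2b(repr(ngram).encode(), digest_size=8)
        return int.from_bytes(h.digest(), "big") % self.table_size

    def lookup(self, token_ids: list) -> list:
        # One memory vector per position, keyed by its trailing n-gram.
        return [self.table[self._bucket(tuple(token_ids[max(0, i - self.n + 1):i + 1]))]
                for i in range(len(token_ids))]


mem = NgramLookupMemory()
vecs = mem.lookup([5, 9, 9, 2])
print(len(vecs), len(vecs[0]))  # 4 positions, one 8-dim vector each
```

<p>The point the sketch makes: cost per retrieval is one hash plus one index, no matter how large the table grows. That is the "new axis of sparsity" framing, and it is why the table can live in cheap CPU RAM instead of on the GPU.</p>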
<p>The paper is titled "Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models". It was signed by Liang Wenfeng, DeepSeek's founder. It is open source on GitHub. It was developed in collaboration with Peking University.</p>
<p>It is exactly what I wanted to build. Except they built it. And they tested it. And they published it. And they open sourced it.</p>
<h2>The Comparison That Hurts</h2>
<div class="comparison-grid">
<div class="comparison-card me">
<h3>My EMM (Notes Only)</h3>
<p>Concept: Train memory separately. Plug into model. Decode vectorized data.</p>
<p>Implementation: None. Still in markdown files.</p>
<p>Results: A folder called EMM_idea/ and growing existential dread.</p>
</div>
<div class="comparison-card them">
<h3>DeepSeek Engram (Published)</h3>
<p>Concept: Conditional memory via scalable lookup. Separate static knowledge from reasoning.</p>
<p>Implementation: Full PyTorch module. GitHub repo. Demo scripts.</p>
<p>Results: Published paper. Open weights. Proven improvements.</p>
</div>
</div>
<p>They did not just have the idea. They executed. They tested. They published. They open sourced. I have a text file that says "train separately, plug in, decode vectors" and a lot of unused time.</p>
<div class="code-block">
<span class="comment"># My EMM notes vs. Engram reality</span><br>
Me: "Train it separately. Plug into model. Decode vectorized data."<br>
Engram: "Structurally separate static knowledge from transformer backbone."<br>
<span class="comment"># Same idea. Different levels of completion.</span>
</div>
<h2>Why This Stings</h2>
<p>It is not just that they had the idea first. It is that the core concept is the same. Separate memory from computation. External lookup. Fast retrieval. I wrote this down weeks before January 12. They published on January 12. The timeline hurts.</p>
<p>But here is the thing. Ideas are not unique. Good ideas especially are not unique. Multiple people think of them at similar times. The difference is who ships. DeepSeek shipped. I did not.</p>
<blockquote>
<p>Research is not about having ideas. It is about having ideas and then doing the work. I had the idea. I did not do the work.</p>
</blockquote>
<h2>What I Am Doing About It</h2>
<p>I could give up. I could say "they did it better" and move on. I am not doing that. Engram is open source. I can use it. I can learn from it. I can integrate it into my own models.</p>
<p>I am going to use DeepSeek's Engram implementation in TMLM-Sonnet-2. And in Opus. And in future models. Why not? It is open source. It solves the problem I cared about. It is better than what I would have built anyway.</p>
<div class="code-block">
<span class="comment"># TMLM-Sonnet-2 plans</span><br>
Model: 300M parameters<br>
Memory: Engram external module<br>
Optimizer: Muon<br>
GPU: 5090 OC LC @ 800W<br>
ETA: Probably never but we try<br>
<span class="comment"># Standing on the shoulders of giants.</span>
</div>
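<p>As for the "plug it into the model" part of the plan: the simplest possible wiring is to retrieve a memory vector per position and blend it into the hidden state. This is a hedged one-function sketch of that framing, not DeepSeek's actual conditioning mechanism; the function name and the fixed gate are mine, and a real integration would learn the gate.</p>

```python
def fuse_memory(hidden: list, memory: list, gate: float = 0.5) -> list:
    """Blend an externally retrieved memory vector into a hidden state.
    Toy stand-in for however Engram actually conditions the backbone:
    a real module would use a learned, per-position gate."""
    return [h + gate * m for h, m in zip(hidden, memory)]


hidden = [2.0, -1.0, 4.0]   # hypothetical hidden state at one position
memory = [1.0, 0.0, -2.0]   # hypothetical retrieved memory vector
print(fuse_memory(hidden, memory))  # [2.5, -1.0, 3.0]
```

<p>The appeal for a tiny model like Sonnet-2 is that the backbone stays small while the table holds the facts, so the 300M parameters can be spent on reasoning instead of memorization.</p>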
<h2>What I Learned</h2>
<p>First, ship faster. If I had started implementing EMM when I first thought of it, maybe I would have something to show. Maybe not better than Engram. But something. Now I have notes and regret.</p>
<p>Second, open source is beautiful. DeepSeek could have kept Engram proprietary. They did not. They open sourced it. Now I can use it. Now everyone can use it. This is how progress works.</p>
<p>Third, I am not special. My ideas are not unique. This is humbling. This is also freeing. I do not need to be the first. I just need to contribute. To try. To build something even if it is built on others' work.</p>
<h2>Sonnet Training Update</h2>
<p>Sonnet is at 15 percent now. It has been running for days. It will finish eventually. Sonnet-2 will use Engram. It will have external memory. It will still probably give fish answers to math questions. But it will have external memory.</p>
<p>Opus is still a dream. Six hundred million parameters. Forty days of training. With Engram it might actually remember things. That would be new. That would be progress.</p>
<h2>Final Thoughts</h2>
<p>DeepSeek beat me to my own idea. It hurts. It also frees me. I do not need to build EMM anymore. Engram exists. It is open. I can use it. I will use it.</p>
<p>Thank you DeepSeek. For the idea. For the code. For the lesson in shipping. And for the slight existential crisis. I needed it. Probably.</p>
<p>TMLM-Sonnet-2 will have Engram. TMLM-Opus will have Engram. My tiny models will have external memory. They will still be tiny. They will still be confused. But they will have memory. That counts for something.</p>
<hr>
</div>
<footer class="post-footer">
<p>Current status: EMM idea abandoned. Engram implementation downloaded. Sonnet at 15%. Sonnet-2 planned with Engram. Will ship eventually. Maybe.</p>
</footer>
</div>
</article>
</main>
<footer>
<div class="container">
<p>Built with curiosity over compute</p>
<p>TinyMemoryLM by AILAY | 2026</p>
</div>
</footer>
</body>
</html>