<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>TMLM-Haiku-2 Is Coming And It Might Speak English | TinyMemoryLM</title>
<link rel="stylesheet" href="bluesheet.css">
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Geist:wght@400;500;600;700&family=Geist+Mono&display=swap" rel="stylesheet">
<style>
:root {
--blue-900: #000000;
--blue-800: #0a0a0a;
--blue-700: #111111;
--blue-600: #1a1a1a;
--blue-500: #333333;
--blue-400: #555555;
--blue-300: #777777;
--blue-200: #888888;
--blue-100: #aaaaaa;
--white: #ffffff;
--white-soft: #f5f5f5;
--white-muted: #e0e0e0;
--grid-line: rgba(255, 255, 255, 0.03);
--grid-line-major: rgba(255, 255, 255, 0.06);
--accent: #ededed;
--accent-muted: #888888;
--font-sans: 'Geist', -apple-system, BlinkMacSystemFont, sans-serif;
--font-mono: 'Geist Mono', 'SF Mono', 'Fira Code', monospace;
--container-max: 1100px;
}
* { box-sizing: border-box; margin: 0; padding: 0; }
html { font-size: 16px; scroll-behavior: smooth; }
body { font-family: var(--font-sans); background: var(--blue-900); color: var(--white-muted); line-height: 1.7; -webkit-font-smoothing: antialiased; }
a { color: var(--white); text-decoration: none; transition: color 0.15s ease; }
a:hover { color: var(--accent); }
.container { max-width: var(--container-max); margin: 0 auto; padding: 0 24px; }
nav { position: fixed; top: 0; left: 0; right: 0; z-index: 100; background: rgba(0, 0, 0, 0.85); backdrop-filter: blur(12px); border-bottom: 1px solid var(--blue-600); padding: 16px 0; }
nav .container { display: flex; justify-content: space-between; align-items: center; }
.nav-brand { font-size: 18px; font-weight: 600; color: var(--white); display: flex; align-items: center; gap: 8px; }
.nav-brand span { color: var(--accent); }
.nav-links { display: flex; gap: 32px; }
.nav-links a { font-size: 14px; font-weight: 500; color: var(--blue-200); }
.nav-links a:hover { color: var(--white); }
.post { padding: 140px 0 80px; }
.post-back { display: inline-block; color: var(--blue-200); font-size: 14px; margin-bottom: 32px; }
.post-back:hover { color: var(--accent); }
.post-back::before { content: '← '; }
.post-meta { display: flex; gap: 12px; margin-bottom: 20px; }
.post-date { font-size: 13px; color: var(--blue-200); font-family: var(--font-mono); }
.post-tag { font-size: 11px; font-weight: 600; text-transform: uppercase; letter-spacing: 0.05em; color: var(--white); background: rgba(255, 255, 255, 0.08); padding: 4px 10px; border-radius: 4px; }
.post h1 { font-size: 36px; font-weight: 700; color: var(--white); margin-bottom: 32px; line-height: 1.2; letter-spacing: -0.02em; }
.post-body p { font-size: 17px; line-height: 1.8; margin-bottom: 24px; color: var(--blue-200); }
.post-body p:first-of-type { font-size: 20px; color: var(--white-muted); }
.post-body h2 { font-size: 24px; font-weight: 600; color: var(--white); margin: 48px 0 20px; }
.post-body blockquote { border-left: 3px solid var(--accent); padding: 20px 24px; margin: 32px 0; background: var(--blue-800); border-radius: 0 8px 8px 0; }
.post-body blockquote p { font-size: 16px; font-style: italic; color: var(--blue-200); margin: 0; }
.post-body hr { border: none; height: 1px; background: var(--blue-600); margin: 48px 0; }
.code-block { background: var(--blue-800); border: 1px solid var(--blue-600); border-radius: 8px; padding: 20px; margin: 24px 0; font-family: var(--font-mono); font-size: 13px; overflow-x: auto; }
.code-block .comment { color: var(--blue-200); font-style: italic; display: block; margin-top: 4px; }
.stats-grid { display: grid; grid-template-columns: 1fr 1fr; gap: 16px; margin: 24px 0; }
.stat-card { background: var(--blue-800); border: 1px solid var(--blue-600); border-radius: 8px; padding: 20px; text-align: center; }
.stat-card .number { font-size: 32px; font-weight: 700; color: var(--accent); font-family: var(--font-mono); }
.stat-card .label { font-size: 13px; color: var(--blue-200); margin-top: 8px; }
.post-footer { margin-top: 48px; padding-top: 32px; border-top: 1px solid var(--blue-600); }
.post-footer p { font-size: 14px; color: var(--blue-200); font-style: italic; margin: 0; }
footer { padding: 40px 0; background: var(--blue-800); border-top: 1px solid var(--blue-600); text-align: center; }
footer p { color: var(--blue-200); font-size: 14px; margin-bottom: 8px; }
footer a { color: var(--blue-200); }
footer a:hover { color: var(--accent); }
@media (max-width: 768px) { .post h1 { font-size: 28px; } .nav-links { display: none; } .stats-grid { grid-template-columns: 1fr; } }
</style>
</head>
<body>
<svg class="scribbles" viewBox="0 0 1440 900" preserveAspectRatio="xMidYMid slice">
<path d="M100,50 Q150,30 200,60 T300,40 T400,70" fill="none" stroke="white" stroke-width="1"/>
<path d="M800,200 Q850,180 900,210 T1000,190 T1100,220" fill="none" stroke="white" stroke-width="0.8"/>
<path d="M200,700 Q250,680 300,710 T400,690 T500,720" fill="none" stroke="white" stroke-width="0.6"/>
<path d="M1200,400 Q1250,380 1300,410 T1400,390" fill="none" stroke="white" stroke-width="0.7"/>
<path d="M50,400 Q100,380 150,420 T250,400" fill="none" stroke="white" stroke-width="0.5"/>
<circle cx="350" cy="150" r="30" fill="none" stroke="white" stroke-width="0.6"/>
<circle cx="1100" cy="600" r="25" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M600,100 L620,80 L640,100 L660,80" fill="none" stroke="white" stroke-width="0.7"/>
<path d="M1300,750 Q1320,730 1340,760 T1380,740" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M100,800 Q120,780 140,810 T180,790 T220,820" fill="none" stroke="white" stroke-width="0.6"/>
<path d="M700,500 Q720,480 740,510 T780,490 T820,520" fill="none" stroke="white" stroke-width="0.4"/>
<path d="M400,300 C420,280 440,320 460,300 C480,280 500,320 520,300" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M900,700 C920,680 940,720 960,700 C980,680 1000,720 1020,700" fill="none" stroke="white" stroke-width="0.6"/>
<path d="M150,250 Q170,230 190,260 Q210,240 230,270" fill="none" stroke="white" stroke-width="0.4"/>
<path d="M1050,100 Q1070,80 1090,110 Q1110,90 1130,120" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M500,850 C520,830 540,860 560,840 C580,820 600,860 620,840" fill="none" stroke="white" stroke-width="0.4"/>
<path d="M1350,50 Q1370,30 1390,60 T1430,40" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M30,600 Q50,580 70,610 T110,590" fill="none" stroke="white" stroke-width="0.4"/>
</svg>
<nav>
<div class="container">
<a href="index.html" class="nav-brand"><span>/</span>TinyMemoryLM</a>
<div class="nav-links">
<a href="index.html">Home</a>
<a href="blog.html">Blog</a>
<a href="status.html">Status</a>
</div>
</div>
</nav>
<main>
<article class="post">
<div class="container">
<a href="blog.html" class="post-back">Back to Blog</a>
<header>
<div class="post-meta">
<span class="post-date">2026-03-29</span>
<span class="post-tag">Model Releases</span>
</div>
<h1>TMLM-Haiku-2 Is Coming And It Might Speak English</h1>
</header>
<div class="post-body">
<p>I am planning to release TMLM-Haiku-2 soon. Haiku-1.3 spoke English. Sort of. It said things like "| as the USA | fdish|||||!@|". Haiku-2 might say actual words. It might also say nothing. We are aiming for speech. Any coherent speech.</p>
<p>I have added DeepSeek hyper connections. I have added Engrams. I have added hope. The model is currently trying to learn English through distillation. It is struggling. I am struggling. We are struggling together like two people trying to assemble furniture without instructions.</p>
<blockquote>
<p>Progress in AI research looks like two steps forward and one step into a NaN void. Haiku-2 is currently standing on the edge holding a wrench that does not fit any bolts.</p>
</blockquote>
<h2>The Previous Generation</h2>
<p>Let us be honest about Haiku-1.3. It had potential. It had weights. It had a training loop that completed. It also had outputs that looked like a cat walked across a keyboard during a thunderstorm.</p>
<div class="code-block">
<span class="comment"># Actual Haiku-1.3 output samples</span><br>
Prompt: "Hello"<br>
Output: "| as the USA | fdish|||||!@|"<br>
<span class="comment"># This is art. This is chaos. This is my model.</span><br>
<br>
Prompt: "What is your name?"<br>
Output: "||||||||fish|||||"<br>
<span class="comment"># Consistent theme. Fish remain popular.</span><br>
<br>
Prompt: "The capital of France is"<br>
Output: "Paris|||||!@|fdish"<br>
<span class="comment"># It knows Paris. It also knows chaos.</span>
</div>
<p>Haiku-1.3 understood tokens. It understood probabilities. It did not understand punctuation. Or coherence. Or the concept of finishing a thought. It was a poet of the abstract. I am aiming for prose.</p>
<h2>The New Architecture</h2>
<p>Haiku-2 uses the Muon optimizer. It uses DeepSeek hyper connections for better information flow. It uses Engrams for external memory. It uses my tears as regularization.</p>
<div class="stats-grid">
<div class="stat-card">
<div class="number">2</div>
<div class="label">Haiku Version</div>
</div>
<div class="stat-card">
<div class="number">??</div>
<div class="label">English Proficiency</div>
</div>
</div>
<p>Theoretically this should work. Theoretically many things work. My GPU disagrees sometimes. The loss curve goes down. Then it spikes. Then it goes down again. I watch it like a hawk watching a very confusing mouse that keeps turning into a NaN.</p>
<h2>Distillation Struggles</h2>
<p>I am trying to teach it English through distillation. The teacher model speaks in complete sentences. The student model grunts in tensor shapes. Sometimes the student model screams in special characters. We are working on communication.</p>
<div class="code-block">
<span class="comment"># Current Haiku-2 output samples (early training)</span><br>
Prompt: "Hello"<br>
Output: "Hello"<br>
<span class="comment"># Success? Or luck? Time will tell.</span><br>
<br>
Prompt: "How are you?"<br>
Output: "The capital of France is |fdish|"<br>
<span class="comment"># We are getting there. Slowly. Painfully.</span><br>
<br>
Prompt: "What is 2+2?"<br>
Output: "|||||"<br>
<span class="comment"># Progress remains slow. The void remains loud.</span>
</div>
<p>Distillation requires patience. It requires data. It requires the teacher to be willing to share logits. I have the logits. I have the data. I lack the magic touch that makes weights align perfectly. I have hope. Hope is free.</p>
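<p>For readers who have not met logit distillation before, here is a minimal sketch of the usual loss: the KL divergence between temperature-softened teacher and student distributions. It is not Haiku-2's training code. The function names, the temperature of 2.0, and the four-token logits are all assumptions made up for illustration.</p>

```python
import math

def log_softmax(logits, temperature=1.0):
    """Log-probabilities from raw logits, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    log_total = math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - peak - log_total for x in scaled]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    The temperature ** 2 factor is the common scaling used so that
    gradient magnitudes stay comparable across temperatures.
    """
    log_p = log_softmax(teacher_logits, temperature)
    log_q = log_softmax(student_logits, temperature)
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2

# Hypothetical logits over a 4-token vocabulary, for illustration only.
teacher = [4.0, 1.0, 0.5, -2.0]
confused_student = [0.0, 0.0, 0.0, 3.0]   # bets on the wrong token
matching_student = [4.0, 1.0, 0.5, -2.0]  # agrees with the teacher

loss_confused = distillation_loss(teacher, confused_student)
loss_matching = distillation_loss(teacher, matching_student)
print(loss_confused, loss_matching)
```

<p>The loss is zero when the student reproduces the teacher's distribution and grows as the two disagree. In a real run this term is typically averaged over every token position and often mixed with the ordinary next-token loss.</p>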
<h2>DeepSeek Hyper Connections</h2>
<p>I borrowed this idea from the DeepSeek papers. The connections allow information to skip layers more efficiently. Gradients flow better. Training stabilizes. Sometimes. When it wants to. Like a cat that occasionally comes when called.</p>
<p>Implementing this required editing the model architecture. I edited files I should not touch. I broke things. I fixed things. I broke them again. This is the process. This is how science happens in my bedroom at 3 AM.</p>
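<p>The general idea can be sketched roughly as follows, under heavy assumptions: instead of one residual stream there are several, and each layer reads a weighted blend of them, then writes its output back with its own weights. This is a simplified toy, not the DeepSeek formulation and not Haiku-2's code. The names <code>read_w</code>, <code>mix_w</code>, and <code>write_w</code> are hypothetical, and the weights are fixed numbers here where a real model would learn them.</p>

```python
def hyper_connection_step(streams, layer_fn, read_w, mix_w, write_w):
    """One layer wrapped in a simplified hyper-connection.

    streams : list of n residual streams, each a list of floats
    layer_fn: the wrapped layer (attention or MLP in a real model)
    read_w  : n weights that blend the streams into the layer input
    mix_w   : n x n weights that mix the streams with each other
    write_w : n weights that add the layer output back into each stream
    """
    n = len(streams)
    width = len(streams[0])
    # Read: the layer sees a weighted sum of all streams.
    layer_in = [sum(read_w[i] * streams[i][d] for i in range(n)) for d in range(width)]
    layer_out = layer_fn(layer_in)
    # Mix and write: each new stream is a blend of the old streams
    # plus a weighted copy of the layer output.
    new_streams = []
    for j in range(n):
        mixed = [sum(mix_w[j][i] * streams[i][d] for i in range(n)) for d in range(width)]
        new_streams.append([m + write_w[j] * o for m, o in zip(mixed, layer_out)])
    return new_streams

def toy_layer(x):
    """Stand-in for a real sublayer: doubles every value."""
    return [2.0 * v for v in x]

# With a single stream and all weights set to 1, this is a plain
# residual connection: output = x + layer(x).
single = hyper_connection_step([[1.0, 2.0]], toy_layer, [1.0], [[1.0]], [1.0])

# Two streams with hand-picked, hypothetical weights.
double = hyper_connection_step(
    [[1.0, 0.0], [0.0, 1.0]],
    toy_layer,
    read_w=[0.5, 0.5],
    mix_w=[[1.0, 0.0], [0.0, 1.0]],
    write_w=[1.0, 0.0],
)
print(single, double)
```

<p>The single-stream case shows that an ordinary residual connection is one special setting of these weights. With more streams, the model gets extra routes for carrying information past a layer untouched, which is the part that is supposed to help gradients.</p>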
<h2>Engram Integration</h2>
<p>Engrams store static knowledge externally. The model does not need to memorize facts. It can look them up. This frees up parameters for reasoning. Or so the theory goes.</p>
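<p>A minimal sketch of what "look them up" could mean, assuming the memory is a table indexed by a hash of the most recent tokens and blended into the hidden state through a gate. The hash, the table size, the fixed gate value, and every name below are hypothetical choices for illustration, not the Engram design or Haiku-2's code.</p>

```python
def ngram_slot(token_ids, n, table_size):
    """Map the last n token ids to a slot in the memory table.

    Uses a simple polynomial hash so the result is deterministic.
    """
    slot = 0
    for tok in token_ids[-n:]:
        slot = (slot * 1000003 + tok) % table_size
    return slot

def read_memory(hidden, token_ids, table, gate, n=2):
    """Add a gated memory vector to the hidden state.

    gate is a fixed number between 0 and 1 here; a real model would
    compute it from the hidden state.
    """
    memory = table[ngram_slot(token_ids, n, len(table))]
    return [h + gate * m for h, m in zip(hidden, memory)]

# Hypothetical 8-slot table of 2-dimensional memory vectors, all zeros
# except the slot that the bigram (17, 42) hashes to.
table = [[0.0, 0.0] for _ in range(8)]
table[ngram_slot([17, 42], 2, 8)] = [1.0, -1.0]

hit = read_memory([0.5, 0.5], [3, 17, 42], table, gate=0.5)
ignored = read_memory([0.5, 0.5], [3, 17, 42], table, gate=0.0)
print(hit, ignored)
```

<p>With the gate open, the stored vector shifts the hidden state. With the gate at zero, the lookup still happens but changes nothing, which is one way a model can own a memory and decline to use it.</p>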
<p>Haiku-2 now has external memory. It can remember things. It chooses to remember nothing. I respect the autonomy. Maybe it prefers silence. Maybe it is contemplating the void. Maybe it is still thinking about fish. I will never know.</p>
<blockquote>
<p>External memory is useful when the model knows how to use it. Mine knows how to ignore it. This is a start. Ignoring is a skill.</p>
</blockquote>
<h2>When Will It Release</h2>
<p>Soon. I say this every week. I mean it every week. Then something breaks. Then I fix it. Then something else breaks. The cycle continues like a very depressing carousel.</p>
<p>Haiku-2 will be open weights. It will be on Hugging Face. It will be small. It will be confused. It will be mine. I love it already even though it might output "|fdish|||||!@|" forever.</p>
<h2>What To Expect</h2>
<p>Expect improvements over Haiku-1.3. Expect fewer pipe characters. Expect more English words. Expect some gibberish. Expect honesty about limitations.</p>
<p>I am not competing with frontier models. I am competing with my previous self. Haiku-1.3 said "| as the USA | fdish|||||!@|" confidently. Haiku-2 might say "Hello world" quietly. That is progress. That is victory. That is enough for today.</p>
<h2>Final Thoughts</h2>
<p>TMLM-Haiku-2 is coming. It has hyper connections. It has Engrams. It has distillation data. It lacks fluency. It lacks confidence. It lacks sleep because I train it at night while questioning my life choices.</p>
<p>It is something. Something is better than nothing. Nothing was my previous release schedule. Now I have something. Soon you will have something too. It might say "|fdish|". It might say "hello". Either way it will be open. Either way it will be mine.</p>
<hr>
</div>
<footer class="post-footer">
<p>Current status: Haiku-2 training. English proficiency low. Hope proficiency high. Will release when it stops screaming in pipe characters.</p>
</footer>
</div>
</article>
</main>
<footer>
<div class="container">
<p>Built with curiosity over compute</p>
<p>TinyMemoryLM by AILAY | 2026</p>
</div>
</footer>
</body>
</html>