<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>I Released TMLM-Haiku-1.3 And It Is Still Dumb | TinyMemoryLM</title>
<link rel="stylesheet" href="bluesheet.css">
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Geist:wght@400;500;600;700&family=Geist+Mono&display=swap" rel="stylesheet">
<style>
:root {
--blue-900: #000000;
--blue-800: #0a0a0a;
--blue-700: #111111;
--blue-600: #1a1a1a;
--blue-500: #333333;
--blue-400: #555555;
--blue-300: #777777;
--blue-200: #888888;
--blue-100: #aaaaaa;
--white: #ffffff;
--white-soft: #f5f5f5;
--white-muted: #e0e0e0;
--grid-line: rgba(255, 255, 255, 0.03);
--grid-line-major: rgba(255, 255, 255, 0.06);
--accent: #ededed;
--accent-muted: #888888;
--font-sans: 'Geist', -apple-system, BlinkMacSystemFont, sans-serif;
--font-mono: 'Geist Mono', 'SF Mono', 'Fira Code', monospace;
--container-max: 1100px;
}
* { box-sizing: border-box; margin: 0; padding: 0; }
html { font-size: 16px; scroll-behavior: smooth; }
body { font-family: var(--font-sans); background: var(--blue-900); color: var(--white-muted); line-height: 1.7; -webkit-font-smoothing: antialiased; }
a { color: var(--white); text-decoration: none; transition: color 0.15s ease; }
a:hover { color: var(--accent); }
.container { max-width: var(--container-max); margin: 0 auto; padding: 0 24px; }
nav { position: fixed; top: 0; left: 0; right: 0; z-index: 100; background: rgba(0, 0, 0, 0.85); backdrop-filter: blur(12px); border-bottom: 1px solid var(--blue-600); padding: 16px 0; }
nav .container { display: flex; justify-content: space-between; align-items: center; }
.nav-brand { font-size: 18px; font-weight: 600; color: var(--white); display: flex; align-items: center; gap: 8px; }
.nav-brand span { color: var(--accent); }
.nav-links { display: flex; gap: 32px; }
.nav-links a { font-size: 14px; font-weight: 500; color: var(--blue-200); }
.nav-links a:hover { color: var(--white); }
.post { padding: 140px 0 80px; }
.post-back { display: inline-block; color: var(--blue-200); font-size: 14px; margin-bottom: 32px; }
.post-back:hover { color: var(--accent); }
.post-back::before { content: '← '; }
.post-meta { display: flex; gap: 12px; margin-bottom: 20px; }
.post-date { font-size: 13px; color: var(--blue-200); font-family: var(--font-mono); }
.post-tag { font-size: 11px; font-weight: 600; text-transform: uppercase; letter-spacing: 0.05em; color: var(--white); background: rgba(255, 255, 255, 0.08); padding: 4px 10px; border-radius: 4px; }
.post h1 { font-size: 36px; font-weight: 700; color: var(--white); margin-bottom: 32px; line-height: 1.2; letter-spacing: -0.02em; }
.post-body p { font-size: 17px; line-height: 1.8; margin-bottom: 24px; color: var(--blue-200); }
.post-body p:first-of-type { font-size: 20px; color: var(--white-muted); }
.post-body h2 { font-size: 24px; font-weight: 600; color: var(--white); margin: 48px 0 20px; }
.post-body blockquote { border-left: 3px solid var(--accent); padding: 20px 24px; margin: 32px 0; background: var(--blue-800); border-radius: 0 8px 8px 0; }
.post-body blockquote p { font-size: 16px; font-style: italic; color: var(--blue-200); margin: 0; }
.post-body hr { border: none; height: 1px; background: var(--blue-600); margin: 48px 0; }
.code-block { background: var(--blue-800); border: 1px solid var(--blue-600); border-radius: 8px; padding: 20px; margin: 24px 0; font-family: var(--font-mono); font-size: 13px; overflow-x: auto; }
.code-block .comment { color: var(--blue-200); font-style: italic; display: block; margin-top: 4px; }
.stats-grid { display: grid; grid-template-columns: 1fr 1fr; gap: 16px; margin: 24px 0; }
.stat-card { background: var(--blue-800); border: 1px solid var(--blue-600); border-radius: 8px; padding: 20px; text-align: center; }
.stat-card .number { font-size: 32px; font-weight: 700; color: var(--accent); font-family: var(--font-mono); }
.stat-card .label { font-size: 13px; color: var(--blue-200); margin-top: 8px; }
.cta-box { background: var(--blue-800); border: 2px solid var(--accent); border-radius: 12px; padding: 24px; margin: 32px 0; text-align: center; }
.cta-box a { color: var(--accent); font-weight: 600; font-size: 18px; word-break: break-all; }
.cta-box a:hover { color: var(--white); }
.cta-box p { margin: 12px 0 0; color: var(--blue-200); font-size: 14px; }
.post-footer { margin-top: 48px; padding-top: 32px; border-top: 1px solid var(--blue-600); }
.post-footer p { font-size: 14px; color: var(--blue-200); font-style: italic; margin: 0; }
footer { padding: 40px 0; background: var(--blue-800); border-top: 1px solid var(--blue-600); text-align: center; }
footer p { color: var(--blue-200); font-size: 14px; margin-bottom: 8px; }
footer a { color: var(--blue-200); }
footer a:hover { color: var(--accent); }
@media (max-width: 768px) { .post h1 { font-size: 28px; } .nav-links { display: none; } .stats-grid { grid-template-columns: 1fr; } }
</style>
</head>
<body>
<svg class="scribbles" viewBox="0 0 1440 900" preserveAspectRatio="xMidYMid slice">
<path d="M100,50 Q150,30 200,60 T300,40 T400,70" fill="none" stroke="white" stroke-width="1"/>
<path d="M800,200 Q850,180 900,210 T1000,190 T1100,220" fill="none" stroke="white" stroke-width="0.8"/>
<path d="M200,700 Q250,680 300,710 T400,690 T500,720" fill="none" stroke="white" stroke-width="0.6"/>
<path d="M1200,400 Q1250,380 1300,410 T1400,390" fill="none" stroke="white" stroke-width="0.7"/>
<path d="M50,400 Q100,380 150,420 T250,400" fill="none" stroke="white" stroke-width="0.5"/>
<circle cx="350" cy="150" r="30" fill="none" stroke="white" stroke-width="0.6"/>
<circle cx="1100" cy="600" r="25" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M600,100 L620,80 L640,100 L660,80" fill="none" stroke="white" stroke-width="0.7"/>
<path d="M1300,750 Q1320,730 1340,760 T1380,740" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M100,800 Q120,780 140,810 T180,790 T220,820" fill="none" stroke="white" stroke-width="0.6"/>
<path d="M700,500 Q720,480 740,510 T780,490 T820,520" fill="none" stroke="white" stroke-width="0.4"/>
<path d="M400,300 C420,280 440,320 460,300 C480,280 500,320 520,300" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M900,700 C920,680 940,720 960,700 C980,680 1000,720 1020,700" fill="none" stroke="white" stroke-width="0.6"/>
<path d="M150,250 Q170,230 190,260 Q210,240 230,270" fill="none" stroke="white" stroke-width="0.4"/>
<path d="M1050,100 Q1070,80 1090,110 Q1110,90 1130,120" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M500,850 C520,830 540,860 560,840 C580,820 600,860 620,840" fill="none" stroke="white" stroke-width="0.4"/>
<path d="M1350,50 Q1370,30 1390,60 T1430,40" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M30,600 Q50,580 70,610 T110,590" fill="none" stroke="white" stroke-width="0.4"/>
</svg>
<nav>
<div class="container">
<a href="index.html" class="nav-brand"><span>/</span>TinyMemoryLM</a>
<div class="nav-links">
<a href="index.html">Home</a>
<a href="blog.html">Blog</a>
<a href="status.html">Status</a>
</div>
</div>
</nav>
<main>
<article class="post">
<div class="container">
<a href="blog.html" class="post-back">Back to Blog</a>
<header>
<div class="post-meta">
<span class="post-date">2026-03-23</span>
<span class="post-tag">Model Releases</span>
</div>
<h1>I Released TMLM-Haiku-1.3 And It Is Still Dumb</h1>
</header>
<div class="post-body">
<p>I released TMLM-Haiku-1.3 today. It is on Hugging Face. It is open weights. It is still completely devoid of intelligence. I trained it with Muon. I spent electricity. I generated heat. The model still thinks Paris is a person.</p>
<p>You might ask why I keep doing this. You might ask why I versioned it to 1.3 instead of 2.0. You might ask why I used Muon instead of AdamW. I do not have good answers. I have weights.</p>
<blockquote>
<p>Progress is not always vertical. Sometimes it is horizontal. Sometimes it is circular. Sometimes it is just releasing the same dumb model with a different optimizer.</p>
</blockquote>
<h2>The Muon Experiment</h2>
<p>AdamW is standard. SGD is classic. Muon is new. It orthogonalizes the momentum of each weight matrix before stepping. It claims better convergence for transformers. It claims to handle large batch sizes better. It claims to be worth the hype. I wanted to test the claims.</p>
<p>I switched the optimizer. I kept the data. I kept the architecture. I kept the low expectations. The training loss went down faster. The validation loss still plateaued. The model still outputs fish facts when asked for math.</p>
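<p>For the curious: the core of Muon is running the momentum through a Newton-Schulz iteration so the update is roughly orthogonal. A minimal NumPy sketch using the published quintic coefficients. This is not my training loop. My training loop has more swearing.</p>

```python
import numpy as np

def newton_schulz5(m, steps=5):
    """Approximately orthogonalize m (flatten its singular-value spread)
    with the quintic Newton-Schulz iteration used by Muon."""
    a, b, c = 3.4445, -4.7750, 2.0315  # published tuned coefficients
    x = m / (np.linalg.norm(m) + 1e-7)  # normalize so singular values <= 1
    transposed = x.shape[0] > x.shape[1]
    if transposed:
        x = x.T  # iterate in the wide orientation: smaller Gram matrix
    for _ in range(steps):
        gram = x @ x.T
        x = a * x + (b * gram + c * gram @ gram) @ x
    return x.T if transposed else x

def muon_step(w, grad, momentum, lr=0.02, beta=0.95):
    """One Muon update: accumulate momentum, then step along its
    orthogonalized version instead of the raw momentum."""
    momentum = beta * momentum + grad
    w = w - lr * newton_schulz5(momentum)
    return w, momentum
```

<p>The iteration does not drive singular values exactly to one. It drives them into a band near one. That is good enough, and it is cheap on a GPU that is already screaming.</p>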
<div class="code-block">
<span class="comment"># Training config comparison</span><br>
Haiku-1.0: AdamW, 261 hours, 600W<br>
Haiku-1.3: Muon, 198 hours, 800W<br>
<span class="comment"># Faster training. More power. Same stupidity.</span>
</div>
<p>The training finished in 198 hours instead of 261. That is a twenty-four percent speedup. I attribute this to Muon. I also attribute it to the 800W overclocked VBIOS I flashed last week. The GPU was screaming. The loss was descending. The result is unchanged.</p>
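<p>Let me do the math on the electricity bill. Assuming the card pins its power limit for the entire run, which it basically did:</p>

```python
# Energy per training run: hours at the wall times watts, in kWh.
runs = {
    "Haiku-1.0 (AdamW, 600W)": (261, 600),
    "Haiku-1.3 (Muon,  800W)": (198, 800),
}
for name, (hours, watts) in runs.items():
    kwh = hours * watts / 1000
    print(f"{name}: {kwh:.1f} kWh")
# Haiku-1.0 (AdamW, 600W): 156.6 kWh
# Haiku-1.3 (Muon,  800W): 158.4 kWh
```

<p>Faster wall-clock. Nearly identical energy. The bill does not care about my speedup.</p>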
<h2>Intelligence Report</h2>
<div class="stats-grid">
<div class="stat-card">
<div class="number">0%</div>
<div class="label">Intelligence Gain</div>
</div>
<div class="stat-card">
<div class="number">24%</div>
<div class="label">Training Speedup</div>
</div>
<div class="stat-card">
<div class="number">100%</div>
<div class="label">Still Hallucinates</div>
</div>
<div class="stat-card">
<div class="number">1.3</div>
<div class="label">Version Number</div>
</div>
</div>
<p>I tested it. I asked simple questions. It gave complex wrong answers. It is confident. It is fluent. It is incorrect. This is the hallmark of a modern language model. I have successfully replicated industry standards in my bedroom.</p>
<h2>Why Version 1.3</h2>
<p>Version 2.0 implies improvement. Version 2.0 implies a new architecture. Version 2.0 implies I solved something. I did not solve anything. I changed the optimizer. I tweaked the learning rate schedule. I added more dropout.</p>
<p>Version 1.3 is honest. It says this is a minor update. It says do not expect miracles. It says the fish facts are still included at no extra cost. I value honesty in versioning.</p>
<h2>The Hardware Impact</h2>
<p>This model was trained on the Astral ROG RTX 5090 OC LC. The one with the Matrix VBIOS. The one running at 800W. The one that heats my room like a furnace. The Muon optimizer allowed larger batch sizes. Larger batch sizes meant more VRAM usage. More VRAM usage meant the 800W power limit was fully utilized.</p>
<p>My electricity bill hates me. My GPU loves me. The model does not care. It exists. It consumes tokens. It produces nonsense. It is alive in the way a spreadsheet is alive.</p>
<blockquote>
<p>I spent eight hundred watts to make a model that cannot count. This is art. This is science. This is a waste of money. All three can be true.</p>
</blockquote>
<h2>What Changed</h2>
<p>Technically? The loss curve is smoother. The gradients are more stable. The training did not NaN this time. I consider this a major victory. After the NaN disaster of last week, a completed training run feels like a miracle.</p>
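<p>The cheapest insurance I know against a repeat is a guard that refuses to apply a non-finite update. A sketch of the idea, not my actual loop:</p>

```python
import math

def safe_update(weights, grads, lr=3e-4):
    """Skip the whole step if any gradient went non-finite, so one NaN
    cannot poison every weight downstream. Returns (weights, applied)."""
    if not all(math.isfinite(g) for g in grads):
        return weights, False  # step skipped, weights untouched
    return [w - lr * g for w, g in zip(weights, grads)], True
```

<p>Log the skips. A few per run is noise. A thousand in a row means the learning rate is a crime scene.</p>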
<p>Functionally? Nothing. It still does not know the capital of France. It still thinks two plus two is a philosophical question. It still apologizes profusely when it is wrong. Then it gives another wrong answer.</p>
<h2>Download It If You Want</h2>
<div class="cta-box">
<a href="https://huggingface.co/CompactAI/TMLM-Haiku-1.3" target="_blank">https://huggingface.co/CompactAI/TMLM-Haiku-1.3</a>
<p>Free. Open weights. Trained with Muon. Still dumb. Run it locally. Save the API costs. Get fish answers directly on your hardware.</p>
</div>
<h2>Future Plans</h2>
<p>Sonnet is still training. It is at 12 percent now. The overclocked GPU is helping. The Muon optimizer is being tested on Sonnet too. If Haiku-1.3 is any indication, Sonnet will be faster to train and equally disappointing.</p>
<p>Opus is still a dream. A 600M parameter dream. A dream that requires me to not burn my house down. I am working on it. Slowly. Painfully. With too much power.</p>
<h2>Final Thoughts</h2>
<p>I released a model. It is not smart. It is faster to train. It uses more electricity. I am proud of it. This is what hobbyists do. We build things. We release them. We accept their flaws. We love them anyway.</p>
<p>If you download it, please be kind. It is trying its best. Its best is not good. But it is trying. Just like me.</p>
<hr>
</div>
<footer class="post-footer">
<p>Current status: Haiku-1.3 released. Sonnet at 12%. GPU at 800W. Sanity at 40%. Will continue training until something works.</p>
</footer>
</div>
</article>
</main>
<footer>
<div class="container">
<p>Built with curiosity over compute</p>
<p>TinyMemoryLM by AILAY | 2026</p>
</div>
</footer>
</body>
</html>