<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>I Released TMLM-Haiku-1.3 And It Is Still Dumb | TinyMemoryLM</title>
<link rel="stylesheet" href="bluesheet.css">
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Geist:wght@400;500;600;700&family=Geist+Mono&display=swap" rel="stylesheet">
<style>
:root {
--blue-900: #000000;
--blue-800: #0a0a0a;
--blue-700: #111111;
--blue-600: #1a1a1a;
--blue-500: #333333;
--blue-400: #555555;
--blue-300: #777777;
--blue-200: #888888;
--blue-100: #aaaaaa;
--white: #ffffff;
--white-soft: #f5f5f5;
--white-muted: #e0e0e0;
--grid-line: rgba(255, 255, 255, 0.03);
--grid-line-major: rgba(255, 255, 255, 0.06);
--accent: #ededed;
--accent-muted: #888888;
--font-sans: 'Geist', -apple-system, BlinkMacSystemFont, sans-serif;
--font-mono: 'Geist Mono', 'SF Mono', 'Fira Code', monospace;
--container-max: 1100px;
}
* { box-sizing: border-box; margin: 0; padding: 0; }
html { font-size: 16px; scroll-behavior: smooth; }
body { font-family: var(--font-sans); background: var(--blue-900); color: var(--white-muted); line-height: 1.7; -webkit-font-smoothing: antialiased; }
a { color: var(--white); text-decoration: none; transition: color 0.15s ease; }
a:hover { color: var(--accent); }
.container { max-width: var(--container-max); margin: 0 auto; padding: 0 24px; }
nav { position: fixed; top: 0; left: 0; right: 0; z-index: 100; background: rgba(0, 0, 0, 0.85); backdrop-filter: blur(12px); border-bottom: 1px solid var(--blue-600); padding: 16px 0; }
nav .container { display: flex; justify-content: space-between; align-items: center; }
.nav-brand { font-size: 18px; font-weight: 600; color: var(--white); display: flex; align-items: center; gap: 8px; }
.nav-brand span { color: var(--accent); }
.nav-links { display: flex; gap: 32px; }
.nav-links a { font-size: 14px; font-weight: 500; color: var(--blue-200); }
.nav-links a:hover { color: var(--white); }
.post { padding: 140px 0 80px; }
.post-back { display: inline-block; color: var(--blue-200); font-size: 14px; margin-bottom: 32px; }
.post-back:hover { color: var(--accent); }
.post-back::before { content: '← '; }
.post-meta { display: flex; gap: 12px; margin-bottom: 20px; }
.post-date { font-size: 13px; color: var(--blue-200); font-family: var(--font-mono); }
.post-tag { font-size: 11px; font-weight: 600; text-transform: uppercase; letter-spacing: 0.05em; color: var(--white); background: rgba(255, 255, 255, 0.08); padding: 4px 10px; border-radius: 4px; }
.post h1 { font-size: 36px; font-weight: 700; color: var(--white); margin-bottom: 32px; line-height: 1.2; letter-spacing: -0.02em; }
.post-body p { font-size: 17px; line-height: 1.8; margin-bottom: 24px; color: var(--blue-200); }
.post-body p:first-of-type { font-size: 20px; color: var(--white-muted); }
.post-body h2 { font-size: 24px; font-weight: 600; color: var(--white); margin: 48px 0 20px; }
.post-body blockquote { border-left: 3px solid var(--accent); padding: 20px 24px; margin: 32px 0; background: var(--blue-800); border-radius: 0 8px 8px 0; }
.post-body blockquote p { font-size: 16px; font-style: italic; color: var(--blue-200); margin: 0; }
.post-body hr { border: none; height: 1px; background: var(--blue-600); margin: 48px 0; }
.code-block { background: var(--blue-800); border: 1px solid var(--blue-600); border-radius: 8px; padding: 20px; margin: 24px 0; font-family: var(--font-mono); font-size: 13px; overflow-x: auto; }
.code-block .comment { color: var(--blue-200); font-style: italic; display: block; margin-top: 4px; }
.stats-grid { display: grid; grid-template-columns: 1fr 1fr; gap: 16px; margin: 24px 0; }
.stat-card { background: var(--blue-800); border: 1px solid var(--blue-600); border-radius: 8px; padding: 20px; text-align: center; }
.stat-card .number { font-size: 32px; font-weight: 700; color: var(--accent); font-family: var(--font-mono); }
.stat-card .label { font-size: 13px; color: var(--blue-200); margin-top: 8px; }
.cta-box { background: var(--blue-800); border: 2px solid var(--accent); border-radius: 12px; padding: 24px; margin: 32px 0; text-align: center; }
.cta-box a { color: var(--accent); font-weight: 600; font-size: 18px; word-break: break-all; }
.cta-box a:hover { color: var(--white); }
.cta-box p { margin: 12px 0 0; color: var(--blue-200); font-size: 14px; }
.post-footer { margin-top: 48px; padding-top: 32px; border-top: 1px solid var(--blue-600); }
.post-footer p { font-size: 14px; color: var(--blue-200); font-style: italic; margin: 0; }
footer { padding: 40px 0; background: var(--blue-800); border-top: 1px solid var(--blue-600); text-align: center; }
footer p { color: var(--blue-200); font-size: 14px; margin-bottom: 8px; }
footer a { color: var(--blue-200); }
footer a:hover { color: var(--accent); }
@media (max-width: 768px) { .post h1 { font-size: 28px; } .nav-links { display: none; } .stats-grid { grid-template-columns: 1fr; } }
</style>
</head>
<body>
<svg class="scribbles" viewBox="0 0 1440 900" preserveAspectRatio="xMidYMid slice">
<path d="M100,50 Q150,30 200,60 T300,40 T400,70" fill="none" stroke="white" stroke-width="1"/>
<path d="M800,200 Q850,180 900,210 T1000,190 T1100,220" fill="none" stroke="white" stroke-width="0.8"/>
<path d="M200,700 Q250,680 300,710 T400,690 T500,720" fill="none" stroke="white" stroke-width="0.6"/>
<path d="M1200,400 Q1250,380 1300,410 T1400,390" fill="none" stroke="white" stroke-width="0.7"/>
<path d="M50,400 Q100,380 150,420 T250,400" fill="none" stroke="white" stroke-width="0.5"/>
<circle cx="350" cy="150" r="30" fill="none" stroke="white" stroke-width="0.6"/>
<circle cx="1100" cy="600" r="25" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M600,100 L620,80 L640,100 L660,80" fill="none" stroke="white" stroke-width="0.7"/>
<path d="M1300,750 Q1320,730 1340,760 T1380,740" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M100,800 Q120,780 140,810 T180,790 T220,820" fill="none" stroke="white" stroke-width="0.6"/>
<path d="M700,500 Q720,480 740,510 T780,490 T820,520" fill="none" stroke="white" stroke-width="0.4"/>
<path d="M400,300 C420,280 440,320 460,300 C480,280 500,320 520,300" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M900,700 C920,680 940,720 960,700 C980,680 1000,720 1020,700" fill="none" stroke="white" stroke-width="0.6"/>
<path d="M150,250 Q170,230 190,260 Q210,240 230,270" fill="none" stroke="white" stroke-width="0.4"/>
<path d="M1050,100 Q1070,80 1090,110 Q1110,90 1130,120" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M500,850 C520,830 540,860 560,840 C580,820 600,860 620,840" fill="none" stroke="white" stroke-width="0.4"/>
<path d="M1350,50 Q1370,30 1390,60 T1430,40" fill="none" stroke="white" stroke-width="0.5"/>
<path d="M30,600 Q50,580 70,610 T110,590" fill="none" stroke="white" stroke-width="0.4"/>
</svg>
<nav>
<div class="container">
<a href="index.html" class="nav-brand"><span>/</span>TinyMemoryLM</a>
<div class="nav-links">
<a href="index.html">Home</a>
<a href="blog.html">Blog</a>
<a href="status.html">Status</a>
</div>
</div>
</nav>
<main>
<article class="post">
<div class="container">
<a href="blog.html" class="post-back">Back to Blog</a>
<header>
<div class="post-meta">
<span class="post-date">2026-03-23</span>
<span class="post-tag">Model Releases</span>
</div>
<h1>I Released TMLM-Haiku-1.3 And It Is Still Dumb</h1>
</header>
<div class="post-body">
<p>I released TMLM-Haiku-1.3 today. It is on Hugging Face. It is open weights. It is still completely devoid of intelligence. I trained it with Muon. I spent electricity. I generated heat. The model still thinks Paris is a person.</p>
<p>You might ask why I keep doing this. You might ask why I versioned it to 1.3 instead of 2.0. You might ask why I used Muon instead of AdamW. I do not have good answers. I have weights.</p>
<blockquote>
<p>Progress is not always vertical. Sometimes it is horizontal. Sometimes it is circular. Sometimes it is just releasing the same dumb model with a different optimizer.</p>
</blockquote>
<h2>The Muon Experiment</h2>
<p>AdamW is standard. SGD is classic. Muon is new. It claims better convergence for transformers. It claims to handle large batch sizes better. It claims to be worth the hype. I wanted to test the claim.</p>
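The claim is concrete enough to sketch. Muon keeps a momentum buffer per weight matrix and approximately orthogonalizes it with a few Newton-Schulz iterations before applying the update. Below is a rough sketch of one step, not my actual training code: the quintic coefficients are the commonly cited ones, while the learning rate and momentum values are illustrative.

```python
import torch

def newton_schulz(G, steps=5, eps=1e-7):
    # Approximately orthogonalize G: push its singular values toward 1
    # using the quintic Newton-Schulz iteration Muon is built around.
    a, b, c = 3.4445, -4.7750, 2.0315
    X = G / (G.norm() + eps)  # scale so the iteration converges
    for _ in range(steps):
        A = X @ X.T
        X = a * X + (b * A + c * A @ A) @ X
    return X

def muon_step(W, grad, momentum, lr=0.02, beta=0.95):
    # One Muon update: momentum-smooth the gradient, orthogonalize, apply.
    momentum.mul_(beta).add_(grad)
    W.add_(newton_schulz(momentum), alpha=-lr)
    return W, momentum
```

Muon only applies to the 2D weight matrices in the middle of the network; embeddings and the output head typically stay on AdamW.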
<p>I switched the optimizer. I kept the data. I kept the architecture. I kept the low expectations. The training loss went down faster. The validation loss still plateaued. The model still outputs fish facts when asked for math.</p>
<div class="code-block">
<span class="comment"># Training config comparison</span><br>
Haiku-1.0: AdamW, 261 hours, 600W<br>
Haiku-1.3: Muon, 198 hours, 800W<br>
<span class="comment"># Faster training. More power. Same stupidity.</span>
</div>
<p>The training finished in 198 hours instead of 261. That is a twenty-four percent speedup. I attribute this to Muon. I also attribute it to the 800W overclocked VBIOS I flashed last week. The GPU was screaming. The loss was descending. The result is unchanged.</p>
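For what it's worth, the wattage cancels the speedup almost exactly. A quick back-of-the-envelope check using the hours and watts from the comparison above (and assuming the card held its power limit for the whole run):

```python
# Energy per training run: hours * watts / 1000 = kWh.
runs = {
    "Haiku-1.0 (AdamW, 600W)": (261, 600),
    "Haiku-1.3 (Muon, 800W)": (198, 800),
}
for name, (hours, watts) in runs.items():
    print(f"{name}: {hours * watts / 1000:.1f} kWh")
# Haiku-1.0 comes to 156.6 kWh, Haiku-1.3 to 158.4 kWh.
```

The bill is flat. Only the schedule moved.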
<h2>Intelligence Report</h2>
<div class="stats-grid">
<div class="stat-card">
<div class="number">0%</div>
<div class="label">Intelligence Gain</div>
</div>
<div class="stat-card">
<div class="number">24%</div>
<div class="label">Training Speedup</div>
</div>
<div class="stat-card">
<div class="number">100%</div>
<div class="label">Still Hallucinates</div>
</div>
<div class="stat-card">
<div class="number">1.3</div>
<div class="label">Version Number</div>
</div>
</div>
<p>I tested it. I asked simple questions. It gave complex wrong answers. It is confident. It is fluent. It is incorrect. This is the hallmark of a modern language model. I have successfully replicated industry standards in my bedroom.</p>
<h2>Why Version 1.3</h2>
<p>Version 2.0 implies improvement. Version 2.0 implies a new architecture. Version 2.0 implies I solved something. I did not solve anything. I changed the optimizer. I tweaked the learning rate schedule. I added more dropout.</p>
<p>Version 1.3 is honest. It says this is a minor update. It says do not expect miracles. It says the fish facts are still included at no extra cost. I value honesty in versioning.</p>
<h2>The Hardware Impact</h2>
<p>This model was trained on the Astral ROG RTX 5090 OC LC. The one with the Matrix VBIOS. The one running at 800W. The one that heats my room like a furnace. The Muon optimizer allowed larger batch sizes. Larger batch sizes meant more VRAM usage. More VRAM usage meant the 800W power limit was fully utilized.</p>
<p>My electricity bill hates me. My GPU loves me. The model does not care. It exists. It consumes tokens. It produces nonsense. It is alive in the way a spreadsheet is alive.</p>
<blockquote>
<p>I spent eight hundred watts to make a model that cannot count. This is art. This is science. This is a waste of money. All three can be true.</p>
</blockquote>
<h2>What Changed</h2>
<p>Technically? The loss curve is smoother. The gradients are more stable. The training did not NaN this time. I consider this a major victory. After the NaN disaster of last week, a completed training run feels like a miracle.</p>
<p>Functionally? Nothing. It still does not know the capital of France. It still thinks two plus two is a philosophical question. It still apologizes profusely when it is wrong. Then it gives another wrong answer.</p>
<h2>Download It If You Want</h2>
<div class="cta-box">
<a href="https://huggingface.co/CompactAI/TMLM-Haiku-1.3" target="_blank">https://huggingface.co/CompactAI/TMLM-Haiku-1.3</a>
<p>Free. Open weights. Trained with Muon. Still dumb. Run it locally. Save the API costs. Get fish answers directly on your hardware.</p>
</div>
<h2>Future Plans</h2>
<p>Sonnet is still training. It is at 12 percent now. The overclocked GPU is helping. The Muon optimizer is being tested on Sonnet too. If Haiku-1.3 is any indication, Sonnet will be faster to train and equally disappointing.</p>
<p>Opus is still a dream. A 600M parameter dream. A dream that requires me to not burn my house down. I am working on it. Slowly. Painfully. With too much power.</p>
<h2>Final Thoughts</h2>
<p>I released a model. It is not smart. It is faster to train. It uses more electricity. I am proud of it. This is what hobbyists do. We build things. We release them. We accept their flaws. We love them anyway.</p>
<p>If you download it, please be kind. It is trying its best. Its best is not good. But it is trying. Just like me.</p>
<hr>
</div>
<footer class="post-footer">
<p>Current status: Haiku-1.3 released. Sonnet at 12%. GPU at 800W. Sanity at 40%. Will continue training until something works.</p>
</footer>
</div>
</article>
</main>
<footer>
<div class="container">
<p>Built with curiosity over compute</p>
<p>TinyMemoryLM by AILAY | 2026</p>
</div>
</footer>
</body>
</html>