mikeumus-divincian committed on
Commit
bfc9737
·
verified ·
1 Parent(s): 700f689

sync index.html to latest org card content (10 vindexes table)

Files changed (1)
  1. index.html +204 -43
index.html CHANGED
@@ -1,5 +1,4 @@
  <!DOCTYPE html>
- <!-- v2 -->
  <html lang="en">
  <head>
  <meta charset="UTF-8">
@@ -21,49 +20,211 @@
  </style>
  </head>
  <body>
-
- <h1>Divinci AI</h1>
- <p class="tagline">Feature-level interpretability artifacts for open transformers — built openly, validated empirically.</p>
-
- <p>A <strong>vindex</strong> is a transformer's weights decompiled into a queryable feature database. It exposes the entity associations, circuit structure, and knowledge-editing surfaces that live inside a model's FFN layers — without requiring GPU inference for most operations.</p>
- <p>Think of it as the model's index: the thing you search before you run it.</p>
-
- <hr>
-
- <h2>Published Vindexes</h2>
  <table>
- <tr><th>Model</th><th>Architecture</th><th>Params</th><th>Vindex</th></tr>
- <tr><td>Gemma 4 E2B-it</td><td>Dense (Gemma 4)</td><td>2B</td><td><a href="https://huggingface.co/Divinci-AI/gemma-4-e2b-vindex">gemma-4-e2b-vindex</a></td></tr>
- <tr><td>Qwen3.6-35B-A3B</td><td>MoE (Qwen3.6)</td><td>35B / 3B active</td><td><a href="https://huggingface.co/Divinci-AI/qwen3.6-35b-a3b-vindex">qwen3.6-35b-a3b-vindex</a></td></tr>
- <tr><td>GPT-OSS 120B</td><td>MoE (OpenAI)</td><td>120B / ~13B active</td><td><em>building</em></td></tr>
  </table>
- <p>Three organizations, three architectures: Gemma dense, Qwen MoE, OpenAI MoE.</p>
-
- <hr>
-
- <h2>What's a Vindex?</h2>
- <p>Standard model weights tell you <em>what</em> a model computes. A vindex tells you <em>where</em> it stores specific knowledge and <em>which features</em> need to change for a targeted edit.</p>
- <p>Concretely: given a query like <code>Paris → capital</code>, a vindex walk returns the layers, feature directions, and token associations that encode that fact. A patch operation writes a rank-1 ΔW that suppresses or overwrites that association — compiled back to standard HuggingFace safetensors for inference.</p>
- <p>LarQL (the toolchain that builds vindexes) is open-source: <a href="https://github.com/chrishayuk/larql">chrishayuk/larql</a> · <a href="https://github.com/Divinci-AI/larql">Divinci-AI/larql</a>.</p>
-
- <hr>
-
- <h2>Research</h2>
- <p><strong>Paper 1 — Architectural Invariants of Transformer Computation</strong> <em>(arXiv forthcoming)</em><br>
- Five properties measured across every model in this collection. Three hold within ±15% coefficient of variation across architectures, organizations, and scales. One collapses under 1-bit quantization. One scales monotonically with model size.</p>
- <p><strong>Paper 2 — Constellation Edits</strong> <em>(draft)</em><br>
- Mechanistic knowledge editing in transformer feature space. Includes a negative result: why activation-space edits fail in 1-bit models, and what weight-space geometry reveals about why.</p>
- <p>Working notebooks: <a href="https://github.com/Divinci-AI/server/tree/preview/notebooks">github.com/Divinci-AI/server/tree/preview/notebooks</a></p>
-
- <hr>
-
- <h2>Working in Public</h2>
- <p>Every measurement in our papers traces back to a notebook and a commit. Negative results ship alongside positive ones — the compensation mechanism that defeats knowledge editing in 1-bit models is in the notebooks, not buried in a supplement.</p>
- <p>If you replicate a result and find a discrepancy, open an issue on the LarQL repo.</p>
-
- <div class="footer">
- Vindexes on this org are free for academic and research use (CC-BY-NC 4.0). Commercial licensing: <a href="mailto:mike@divinci.ai">mike@divinci.ai</a>
- </div>

  </body>
- </html>
+ <h1 id="divinci-ai">Divinci AI</h1>
+ <p class="tagline">Feature-level interpretability artifacts for open transformers —
+ built openly, validated empirically.</p>
+ <p>A <strong>vindex</strong> is a transformer's weights decompiled into
+ a queryable feature database. It exposes the entity associations,
+ circuit structure, and knowledge-editing surfaces that live inside a
+ model's FFN layers — without requiring GPU inference for most
+ operations.</p>
+ <p>Think of it as the model's index: the thing you search before you run
+ it.</p>
+ <hr />
+ <h2 id="interactive-viewer">Interactive viewer</h2>
+ <p><a href="https://huggingface.co/spaces/Divinci-AI/vindex-viewer"><img
+ src="https://huggingface.co/spaces/Divinci-AI/vindex-viewer/resolve/main/vindex-hero-bg.gif"
+ alt="LarQL Vindex Viewer — interactive 3D + 2D circuit visualization" /></a></p>
+ <p><strong><a
+ href="https://huggingface.co/spaces/Divinci-AI/vindex-viewer">→ Open the
+ interactive viewer</a></strong></p>
+ <p>Pick any of 9 models from the dropdown. Toggle between the 3D
+ cylinder spiral and a flat 2D circuit/network view. Hit <strong>⇌
+ Compare</strong> to render the current model alongside Bonsai 1-bit,
+ side-by-side — the contrast between fp16 structure (organized rings) and
+ 1-bit dissolution (scattered cloud) is the most direct picture we know
+ how to render of what 1-bit training does to a transformer's internal
+ organization. Search for entity features
+ (<code>?q=paris&amp;model=gemma-4-e2b</code>) to see real probe-derived
+ activations light up across the layer stack — backed by a 5000-token
+ offline-built search index.</p>
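The query-parameter search described above can be scripted. A minimal sketch that assumes only the Space URL and the `q`/`model` parameter names shown in the text; the helper name is illustrative:

```python
from urllib.parse import urlencode

BASE = "https://huggingface.co/spaces/Divinci-AI/vindex-viewer"

def viewer_url(query: str, model: str) -> str:
    """Build a deep link into the vindex viewer's entity-feature search."""
    return f"{BASE}?{urlencode({'q': query, 'model': model})}"

print(viewer_url("paris", "gemma-4-e2b"))
# → https://huggingface.co/spaces/Divinci-AI/vindex-viewer?q=paris&model=gemma-4-e2b
```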
+ <hr />
+ <h2 id="published-vindexes">Published vindexes</h2>
+ <p>Cross-family evidence in hand: <strong>Gemma</strong>,
+ <strong>Qwen3</strong>, <strong>Mistral</strong>,
+ <strong>Llama</strong>, <strong>OpenAI MoE</strong>, plus two 1-bit
+ controls.</p>
  <table>
+ <thead>
+ <tr>
+ <th>Model</th>
+ <th>Architecture</th>
+ <th>Params</th>
+ <th>Vindex</th>
+ <th>C4 (layer temp)</th>
+ <th>Notes</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td><strong>Gemma 4 E2B-it</strong></td>
+ <td>Dense (Gemma 4)</td>
+ <td>2B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/gemma-4-e2b-vindex">gemma-4-e2b-vindex</a></td>
+ <td><strong>0.0407 ± 0.0004</strong> ✓</td>
+ <td>3-seed validated; headline universal-constant model</td>
+ </tr>
+ <tr>
+ <td>Qwen3-0.6B</td>
+ <td>Dense (Qwen 3)</td>
+ <td>0.6B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/qwen3-0.6b-vindex">qwen3-0.6b-vindex</a></td>
+ <td>0.411</td>
+ <td>Smallest published; Qwen3 family-elevated C4</td>
+ </tr>
+ <tr>
+ <td>Qwen3-8B bf16</td>
+ <td>Dense (Qwen 3)</td>
+ <td>8B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/qwen3-8b-vindex">qwen3-8b-vindex</a></td>
+ <td>0.804</td>
+ <td>Architecture control for Bonsai</td>
+ </tr>
+ <tr>
+ <td>Qwen3.6-35B-A3B</td>
+ <td>MoE (Qwen 3.6)</td>
+ <td>35B / 3B active</td>
+ <td><a href="https://huggingface.co/Divinci-AI/qwen3.6-35b-a3b-vindex">qwen3.6-35b-a3b-vindex</a></td>
+ <td>—</td>
+ <td>256 experts, 40 layers</td>
+ </tr>
+ <tr>
+ <td>Ministral-3B</td>
+ <td>Dense (Mistral 3)</td>
+ <td>3B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/ministral-3b-vindex">ministral-3b-vindex</a></td>
+ <td>0.265</td>
+ <td>fp8 → bf16 reconstruction</td>
+ </tr>
+ <tr>
+ <td>Llama 3.1-8B</td>
+ <td>Dense (Llama 3.1)</td>
+ <td>8B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/llama-3.1-8b-vindex">llama-3.1-8b-vindex</a></td>
+ <td><strong>0.012</strong> ✓</td>
+ <td>Llama family signature</td>
+ </tr>
+ <tr>
+ <td>MedGemma 1.5-4B</td>
+ <td>Dense (Gemma multimodal)</td>
+ <td>4B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/medgemma-1.5-4b-vindex">medgemma-1.5-4b-vindex</a></td>
+ <td><strong>1.898 ⚠</strong></td>
+ <td>45× cohort anomaly — under investigation</td>
+ </tr>
+ <tr>
+ <td>GPT-OSS 120B</td>
+ <td>MoE (OpenAI)</td>
+ <td>120B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/gpt-oss-120b-vindex">gpt-oss-120b-vindex</a></td>
+ <td>—</td>
+ <td>S[0] grows 117× with depth (L0=111 → final=13,056)</td>
+ </tr>
+ <tr>
+ <td><strong>Bonsai 8B</strong></td>
+ <td>1-bit (Qwen 3 base, post-quantized)</td>
+ <td>8B</td>
+ <td><em>vindex pending publish</em></td>
+ <td>0.429</td>
+ <td><strong>C5 = 1</strong> (circuit dissolved); var@64 = 0.093</td>
+ </tr>
+ <tr>
+ <td><strong>BitNet b1.58-2B-4T</strong></td>
+ <td>1-bit (Microsoft, native)</td>
+ <td>2B</td>
+ <td><em>vindex pending publish</em></td>
+ <td>(Phase 2 pending)</td>
+ <td><strong>var@64 = 0.111</strong> mean across 30 layers — n=2
+ confirmation of dissolution</td>
+ </tr>
+ </tbody>
  </table>
+ <hr />
+ <h2 id="whats-a-vindex">What's a vindex?</h2>
+ <p>Standard model weights tell you <em>what</em> a model computes. A
+ vindex tells you <em>where</em> it stores specific knowledge and
+ <em>which features</em> need to change for a targeted edit.</p>
+ <p>Concretely: given a query like <code>"Paris → capital"</code>, a
+ vindex walk returns the layers, feature directions, and token
+ associations that encode that fact. A patch operation writes a rank-1 ΔW
+ that suppresses or overwrites that association — compiled back to
+ standard HuggingFace safetensors for inference.</p>
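The rank-1 ΔW idea above can be sketched numerically. This is not LarQL's actual API, only a NumPy illustration of how adding an outer-product update to a weight block suppresses a single input-feature/output association; the weight shape and the directions u, v are made up:

```python
import numpy as np

def rank1_patch(W: np.ndarray, u: np.ndarray, v: np.ndarray,
                alpha: float = 1.0) -> np.ndarray:
    """Apply a rank-1 edit W' = W + alpha * u v^T.

    With alpha < 0 this suppresses the association mapping input
    direction v to output direction u; alpha > 0 strengthens it.
    """
    return W + alpha * np.outer(u, v)

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 32))   # stand-in for one FFN weight block
v = rng.normal(size=32)
v /= np.linalg.norm(v)          # unit input-feature direction
u = W @ v                       # the block's current output along v

W_edit = rank1_patch(W, u, v, alpha=-1.0)
print(np.linalg.norm(W @ v), np.linalg.norm(W_edit @ v))
# v's output is driven to ~0; directions orthogonal to v are untouched
```

Because the update is rank-1, the edit is both cheap to store and cheap to compile back into a full-precision weight file.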
+ <p>LarQL (the toolchain that builds vindexes) is open-source: <a
+ href="https://github.com/chrishayuk/larql">github.com/chrishayuk/larql</a>
+ | <a
+ href="https://github.com/Divinci-AI/larql">github.com/Divinci-AI/larql</a>.</p>
+ <hr />
+ <h2 id="research">Research</h2>
+ <h3
+ id="paper-1--architectural-invariants-of-transformer-computation">Paper
+ 1 — <em>Architectural Invariants of Transformer Computation</em></h3>
+ <p><em>arXiv preprint forthcoming</em></p>
+ <p>Five properties measured across every model in this collection.
+ <strong>Three hold within ±15% coefficient of variation</strong> across
+ architectures, organizations, and scales. <strong>One collapses under
+ 1-bit quantization</strong> — replicated across two independent 1-bit
+ models from two organizations (n = 2). <strong>One scales monotonically
+ with model size</strong>.</p>
+ <p>The headline universal constant — layer temperature C4 — is
+ reproducible at the <strong>1% precision level</strong>: a three-seed
+ run on Gemma 4 E2B gives <code>C4 = 0.0407 ± 0.0004</code>, with
+ circuit-stage count perfectly stable (<code>C5 = 4 ± 0</code>) across
+ all seeds.</p>
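The seed statistics quoted here (mean ± standard deviation, and the ±15% coefficient-of-variation criterion) reduce to a few lines of stdlib Python. The per-seed values below are invented to be consistent with the quoted C4 = 0.0407 ± 0.0004; they are not the actual runs:

```python
import statistics

c4_by_seed = [0.0403, 0.0407, 0.0411]  # hypothetical three-seed measurements

mean = statistics.mean(c4_by_seed)
sd = statistics.stdev(c4_by_seed)      # sample standard deviation
cv = sd / mean                         # coefficient of variation

print(f"C4 = {mean:.4f} ± {sd:.4f}  (CV = {cv:.1%})")
# for these values: C4 = 0.0407 ± 0.0004  (CV = 1.0%)
```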
+ <h3 id="paper-2--constellation-edits">Paper 2 — <em>Constellation
+ Edits</em></h3>
+ <p><em>draft, arXiv after 3-seed runs + α-sweep appendix</em></p>
+ <p>Mechanistic knowledge editing in transformer feature space. Includes
+ a negative result: why activation-space edits fail in 1-bit models, and
+ what weight-space geometry reveals about why.</p>
+ <h3 id="companion-blog-series--the-interpretability-diaries">Companion
+ blog series — <em>The Interpretability Diaries</em></h3>
+ <ul>
+ <li><a
+ href="https://divinci.ai/blog/architecture-every-llm-converges-to/">Part
+ I — The Architecture Every Language Model Converges To</a> — five
+ universal constants, what holds and what doesn't</li>
+ <li><a
+ href="https://divinci.ai/blog/deleting-paris-from-a-language-model/">Part
+ II — Deleting Paris from a Language Model</a> — Gate-3 surgical
+ knowledge edit with a receipt; rank-1 ΔW that suppresses one fact at
+ +0.02% perplexity</li>
+ <li><a href="https://divinci.ai/blog/when-the-circuit-dissolves/">Part
+ III — When the Circuit Dissolves</a> — two natively-trained 1-bit
+ models, two organizations, same dissolution: var@64 ≈ 0.10 vs ~0.85 for
+ fp16</li>
+ </ul>
+ <p>Working notebooks: <a
+ href="https://github.com/Divinci-AI/server/tree/preview/notebooks">github.com/Divinci-AI/server/tree/preview/notebooks</a></p>
+ <hr />
+ <h2 id="working-in-public">Working in public</h2>
+ <p>Every measurement in our papers traces back to a notebook and a
+ commit. Negative results ship alongside positive ones — the MLP
+ compensation mechanism that defeats knowledge editing in 1-bit models is
+ in the notebooks, not buried in a supplement.</p>
+ <p>If you replicate a result and find a discrepancy, open an issue on
+ the LarQL repo.</p>
+ <hr />
+ <p><em>Vindexes on this org are free for academic and research use
+ (CC-BY-NC 4.0). Commercial licensing: <a
+ href="mailto:mike@divinci.ai">mike@divinci.ai</a></em></p>

  </body>
+ </html>