<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Divinci AI</title>
<style>
  body { font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', sans-serif; max-width: 860px; margin: 0 auto; padding: 2rem 1.5rem; color: #1a1a1a; line-height: 1.6; }
  h1 { font-size: 1.8rem; font-weight: 700; margin-bottom: 0.25rem; }
  h2 { font-size: 1.1rem; font-weight: 600; margin-top: 2rem; margin-bottom: 0.5rem; border-bottom: 1px solid #e5e7eb; padding-bottom: 0.25rem; }
  p { margin: 0.5rem 0 1rem; }
  table { border-collapse: collapse; width: 100%; margin: 1rem 0; font-size: 0.9rem; }
  th, td { border: 1px solid #e5e7eb; padding: 0.5rem 0.75rem; text-align: left; }
  th { background: #f9fafb; font-weight: 600; }
  a { color: #2563eb; text-decoration: none; }
  a:hover { text-decoration: underline; }
  .tagline { color: #6b7280; font-size: 1rem; margin-bottom: 1.5rem; }
  .footer { margin-top: 2.5rem; padding-top: 1rem; border-top: 1px solid #e5e7eb; font-size: 0.85rem; color: #6b7280; }
  hr { border: none; border-top: 1px solid #e5e7eb; margin: 1.5rem 0; }
</style>
</head>
<body>
<h1 id="divinci-ai">Divinci AI</h1>
<p class="tagline">Feature-level interpretability artifacts for open transformers —
built openly, validated empirically.</p>
<p>A <strong>vindex</strong> is a transformer's weights decompiled into
a queryable feature database. It exposes the entity associations,
circuit structure, and knowledge-editing surfaces that live inside a
model's FFN layers — without requiring GPU inference for most
operations.</p>
<p>Think of it as the model's index: the thing you search before you run
it.</p>
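<p>A minimal sketch of what "search the index first" could look like.
The repo id is real; the artifact file name and column names below are
hypothetical stand-ins, since each vindex repo defines its own layout.</p>
<pre><code># Hedged sketch: querying a vindex offline, no GPU inference.
# "features.parquet" and the column names are assumptions, not the
# published schema.
from huggingface_hub import hf_hub_download
import pandas as pd

path = hf_hub_download(
    repo_id="Divinci-AI/gemma-4-e2b-vindex",  # real repo
    filename="features.parquet",              # hypothetical artifact name
)
features = pd.read_parquet(path)

# Entity lookup: which layers and feature directions encode "paris"?
paris = features[features["token"].str.contains("paris", case=False)]
print(paris[["layer", "feature_id", "activation"]].head())
</code></pre>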
<hr />
<h2 id="interactive-viewer">Interactive viewer</h2>
<p><a href="https://huggingface.co/spaces/Divinci-AI/vindex-viewer"><img
src="https://huggingface.co/spaces/Divinci-AI/vindex-viewer/resolve/main/vindex-hero-bg.gif"
alt="LarQL Vindex Viewer — interactive 3D + 2D circuit visualization" /></a></p>
<p><strong><a
href="https://huggingface.co/spaces/Divinci-AI/vindex-viewer">→ Open the
interactive viewer</a></strong></p>
<p>Pick any of 9 models from the dropdown. Toggle between the 3D
cylinder spiral and a flat 2D circuit/network view. Hit <strong>⇌
Compare</strong> to render the current model side-by-side with Bonsai
1-bit — the contrast between fp16 structure (organized rings) and 1-bit
dissolution (scattered cloud) is the most direct picture we know how to
render of what 1-bit training does to a transformer's internal
organization. Search for entity features
(<code>?q=paris&amp;model=gemma-4-e2b</code>) to see real probe-derived
activations light up across the layer stack, backed by a 5,000-token
offline-built search index.</p>
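<p>Spelled out as a full deep link (the composition onto the Space URL
is an assumption here; the viewer defines exactly how its query string
is handled):</p>
<pre><code>https://huggingface.co/spaces/Divinci-AI/vindex-viewer?q=paris&amp;model=gemma-4-e2b
</code></pre>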
<hr />
<h2 id="published-vindexes">Published vindexes</h2>
<p>Cross-family evidence in hand: <strong>Gemma</strong>,
<strong>Qwen3</strong>, <strong>Mistral</strong>,
<strong>Llama</strong>, <strong>OpenAI MoE</strong>,
<strong>Moonshot MoE</strong>, <strong>DeepSeek-V4 MoE</strong>, plus two 1-bit
controls.</p>
<table>
<thead>
<tr><th>Model</th><th>Architecture</th><th>Params</th><th>Vindex</th><th>C4 (layer temp)</th><th>Notes</th></tr>
</thead>
<tbody>
<tr><td><strong>Gemma 4 E2B-it</strong></td><td>Dense (Gemma 4)</td><td>2B</td><td><a href="https://huggingface.co/Divinci-AI/gemma-4-e2b-vindex">gemma-4-e2b-vindex</a></td><td><strong>0.0407 ± 0.0004</strong> ✓</td><td>3-seed validated; headline universal-constant model</td></tr>
<tr><td>Qwen3-0.6B</td><td>Dense (Qwen 3)</td><td>0.6B</td><td><a href="https://huggingface.co/Divinci-AI/qwen3-0.6b-vindex">qwen3-0.6b-vindex</a></td><td>0.411</td><td>Smallest published; Qwen3 family-elevated C4</td></tr>
<tr><td>Qwen3-8B bf16</td><td>Dense (Qwen 3)</td><td>8B</td><td><a href="https://huggingface.co/Divinci-AI/qwen3-8b-vindex">qwen3-8b-vindex</a></td><td>0.804</td><td>Architecture control for Bonsai</td></tr>
<tr><td>Qwen3.6-35B-A3B</td><td>MoE (Qwen 3.6)</td><td>35B / 3B active</td><td><a href="https://huggingface.co/Divinci-AI/qwen3.6-35b-a3b-vindex">qwen3.6-35b-a3b-vindex</a></td><td>—</td><td>256 experts, 40 layers</td></tr>
<tr><td>Ministral-3B</td><td>Dense (Mistral 3)</td><td>3B</td><td><a href="https://huggingface.co/Divinci-AI/ministral-3b-vindex">ministral-3b-vindex</a></td><td>0.265</td><td>fp8 → bf16 reconstruction</td></tr>
<tr><td>Llama 3.1-8B</td><td>Dense (Llama 3.1)</td><td>8B</td><td><a href="https://huggingface.co/Divinci-AI/llama-3.1-8b-vindex">llama-3.1-8b-vindex</a></td><td><strong>0.012</strong> ✓</td><td>Llama family signature</td></tr>
<tr><td>MedGemma 1.5-4B</td><td>Dense (Gemma multimodal)</td><td>4B</td><td><a href="https://huggingface.co/Divinci-AI/medgemma-1.5-4b-vindex">medgemma-1.5-4b-vindex</a></td><td><strong>1.898 ⚠</strong></td><td>45× cohort anomaly — under investigation</td></tr>
<tr><td>GPT-OSS 120B</td><td>MoE (OpenAI)</td><td>120B</td><td><a href="https://huggingface.co/Divinci-AI/gpt-oss-120b-vindex">gpt-oss-120b-vindex</a></td><td>—</td><td>S[0] grows 117× with depth (L0=111 → final=13,056)</td></tr>
<tr><td><strong>Kimi-K2-Instruct</strong></td><td>MoE fp8-native (DeepSeek-V3 style)</td><td>1T / 32B active</td><td><a href="https://huggingface.co/Divinci-AI/kimi-k2-instruct-vindex">kimi-k2-instruct-vindex</a></td><td><strong>0.0938</strong> (MoE median)</td><td>60 MoE layers; 42.28 GB gate_proj binary; broader L52–L60 secondary rise than the initial dome SVD suggested</td></tr>
<tr><td><strong>DeepSeek-V4-Flash</strong></td><td>MoE MXFP4 (DeepSeek-V4)</td><td>43L / 256 experts / 6 active</td><td><a href="https://huggingface.co/Divinci-AI/deepseek-v4-flash-vindex">deepseek-v4-flash-vindex</a></td><td><strong>0.108</strong> (MoE median)</td><td>43-layer all-MoE; 11.54 GB gate_proj binary; first-peak L18 + double-bend profile (distinct from Kimi's smooth dome); MXFP4 expert unpacking</td></tr>
<tr><td><strong>DeepSeek-V4-Pro</strong></td><td>MoE MXFP4 (DeepSeek-V4)</td><td>61L / 384 experts / 6 active</td><td><a href="https://huggingface.co/Divinci-AI/deepseek-v4-pro-vindex">deepseek-v4-pro-vindex</a></td><td><strong>0.0653</strong> (MoE median)</td><td>61-layer all-MoE; 42.98 GB gate_proj binary; lowest var@64 of the 3 published MoE vindexes (V4-Pro 0.065 &lt; Kimi 0.094 &lt; V4-Flash 0.108) — V4-Pro experts are the most shared/redundant; late secondary rise L53–L60</td></tr>
<tr><td><strong>Bonsai 8B</strong></td><td>1-bit (Qwen 3 base, post-quantized)</td><td>8B</td><td><em>vindex pending publish</em></td><td>0.429</td><td><strong>C5 = 1</strong> (circuit dissolved); var@64 = 0.093</td></tr>
<tr><td><strong>BitNet b1.58-2B-4T</strong></td><td>1-bit (Microsoft, native)</td><td>2B</td><td><em>vindex pending publish</em></td><td>(Phase 2 pending)</td><td><strong>var@64 = 0.111</strong> mean across 30 layers — n=2 confirmation of dissolution</td></tr>
</tbody>
</tbody>
</table>

<hr />
<h2 id="whats-a-vindex">What's a vindex?</h2>
<p>Standard model weights tell you <em>what</em> a model computes. A
vindex tells you <em>where</em> it stores specific knowledge and
<em>which features</em> need to change for a targeted edit.</p>
<p>Concretely: given a query like <code>"Paris → capital"</code>, a
vindex walk returns the layers, feature directions, and token
associations that encode that fact. A patch operation writes a rank-1 ΔW
that suppresses or overwrites that association — compiled back to
standard HuggingFace safetensors for inference.</p>
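<p>A minimal sketch of the rank-1 patch idea, not LarQL's actual API:
the target layer, file names, and directions <code>u</code> and
<code>v</code> below are hypothetical stand-ins for what a vindex walk
would return.</p>
<pre><code># Hedged sketch of a rank-1 weight patch. In a real edit, u and v
# would come from the vindex walk (the feature direction and its
# association direction) and alpha from a suppression sweep; here they
# are random placeholders.
import torch
from safetensors.torch import load_file, save_file

state = load_file("model.safetensors")         # hypothetical file name
key = "model.layers.17.mlp.down_proj.weight"   # hypothetical target layer
W = state[key]

u = torch.randn(W.shape[0]); u = u / u.norm()  # write direction
v = torch.randn(W.shape[1]); v = v / v.norm()  # read direction
alpha = -0.5                                   # suppression strength

# Rank-1 delta: W' = W + alpha * u v^T, recompiled to safetensors.
state[key] = (W.float() + alpha * torch.outer(u, v)).to(W.dtype)
save_file(state, "model.patched.safetensors")
</code></pre>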
<p>LarQL (the toolchain that builds vindexes) is open-source: <a
href="https://github.com/chrishayuk/larql">github.com/chrishayuk/larql</a>
| <a
href="https://github.com/Divinci-AI/larql">github.com/Divinci-AI/larql</a>.</p>
<hr />
<h2 id="research">Research</h2>
<h3
id="paper-1--architectural-invariants-of-transformer-computation">Paper
1 — <em>Architectural Invariants of Transformer Computation</em></h3>
<p><em>arXiv preprint forthcoming</em></p>
<p>Five properties measured across every model in this collection.
<strong>Three hold within ±15% coefficient of variation</strong> across
architectures, organizations, and scales. <strong>One collapses under
1-bit quantization</strong> — replicated across two independent 1-bit
models from two organizations (n = 2). <strong>One scales monotonically
with model size</strong>.</p>
<p>The headline universal constant — layer temperature C4 — is
reproducible at the <strong>1% precision level</strong>: a three-seed
run on Gemma 4 E2B gives <code>C4 = 0.0407 ± 0.0004</code>, with
circuit-stage count perfectly stable (<code>C5 = 4 ± 0</code>) across
all seeds.</p>
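<p>The arithmetic behind "1% precision" is just the coefficient of
variation across seeds. The per-seed values below are illustrative
numbers consistent with the published mean and spread, not the raw
measurements:</p>
<pre><code># CV = stdev / mean across the three seeds.
import statistics

c4_runs = [0.0403, 0.0407, 0.0411]  # illustrative, matching 0.0407 +/- 0.0004
mean = statistics.mean(c4_runs)
cv = statistics.stdev(c4_runs) / mean
print(f"C4 = {mean:.4f}, CV = {cv:.1%}")  # ~1%, far inside the +/-15% criterion
</code></pre>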
<h3 id="paper-2--constellation-edits">Paper 2 β€” <em>Constellation
Edits</em></h3>
<p><em>draft, arXiv after 3-seed runs + α-sweep appendix</em></p>
<p>Mechanistic knowledge editing in transformer feature space. Includes
a negative result: why activation-space edits fail in 1-bit models, and
what the weight-space geometry reveals about that failure.</p>
<h3 id="companion-blog-series--the-interpretability-diaries">Companion
blog series β€” <em>The Interpretability Diaries</em></h3>
<ul>
<li><a
href="https://divinci.ai/blog/architecture-every-llm-converges-to/">Part
I — The Architecture Every Language Model Converges To</a> — five
universal constants, what holds and what doesn't</li>
<li><a
href="https://divinci.ai/blog/deleting-paris-from-a-language-model/">Part
II — Deleting Paris from a Language Model</a> — Gate-3 surgical
knowledge edit with a receipt; a rank-1 ΔW that suppresses one fact at
+0.02% perplexity</li>
<li><a href="https://divinci.ai/blog/when-the-circuit-dissolves/">Part
III — When the Circuit Dissolves</a> — two independent 1-bit models,
two organizations, same dissolution: var@64 ≈ 0.10 vs ~0.85 for fp16
(a hedged sketch of one reading of var@64 follows this list)</li>
</ul>
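<p>The papers define var@64 precisely; one plausible reading, assumed
here purely for illustration, is the fraction of squared singular-value
mass captured by a layer matrix's top 64 components. A minimal sketch
under that assumption, showing how a structured (low-rank) matrix and
an i.i.d.-noise "dissolved" control separate on the metric:</p>
<pre><code># Hedged sketch: one plausible var@64 computation. The real metric is
# defined in the papers; this is an assumed reading, not their code.
import torch

def var_at_k(W: torch.Tensor, k: int = 64) -> float:
    # Fraction of squared singular-value mass in the top-k components.
    s = torch.linalg.svdvals(W.float())
    return (s[:k].square().sum() / s.square().sum()).item()

structured = torch.randn(1024, 64) @ torch.randn(64, 4096)  # rank 64: var@64 ~ 1.0
dissolved = torch.randn(1024, 4096)                         # flat spectrum: var@64 small
print(var_at_k(structured), var_at_k(dissolved))
</code></pre>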
<p>Working notebooks: <a
href="https://github.com/Divinci-AI/server/tree/preview/notebooks">github.com/Divinci-AI/server/tree/preview/notebooks</a></p>
<hr />
<h2 id="working-in-public">Working in public</h2>
<p>Every measurement in our papers traces back to a notebook and a
commit. Negative results ship alongside positive ones — the MLP
compensation mechanism that defeats knowledge editing in 1-bit models is
in the notebooks, not buried in a supplement.</p>
<p>If you replicate a result and find a discrepancy, open an issue on
the LarQL repo.</p>
<hr />
<p><em>Vindexes on this org are free for academic and research use
(CC-BY-NC 4.0). Commercial licensing: <a
href="mailto:mike@divinci.ai">mike@divinci.ai</a></em></p>

</body>
</html>