mikeumus-divincian committed on
Commit
bfc9737
·
verified ·
1 Parent(s): 700f689

sync index.html to latest org card content (10 vindexes table)

Files changed (1)
  1. index.html +204 -43
index.html CHANGED
@@ -1,5 +1,4 @@
  <!DOCTYPE html>
- <!-- v2 -->
  <html lang="en">
  <head>
  <meta charset="UTF-8">
@@ -21,49 +20,211 @@
  </style>
  </head>
  <body>
-
- <h1>Divinci AI</h1>
- <p class="tagline">Feature-level interpretability artifacts for open transformers — built openly, validated empirically.</p>
-
- <p>A <strong>vindex</strong> is a transformer's weights decompiled into a queryable feature database. It exposes the entity associations, circuit structure, and knowledge-editing surfaces that live inside a model's FFN layers — without requiring GPU inference for most operations.</p>
- <p>Think of it as the model's index: the thing you search before you run it.</p>
-
- <hr>
-
- <h2>Published Vindexes</h2>
  <table>
- <tr><th>Model</th><th>Architecture</th><th>Params</th><th>Vindex</th></tr>
- <tr><td>Gemma 4 E2B-it</td><td>Dense (Gemma 4)</td><td>2B</td><td><a href="https://huggingface.co/Divinci-AI/gemma-4-e2b-vindex">gemma-4-e2b-vindex</a></td></tr>
- <tr><td>Qwen3.6-35B-A3B</td><td>MoE (Qwen3.6)</td><td>35B / 3B active</td><td><a href="https://huggingface.co/Divinci-AI/qwen3.6-35b-a3b-vindex">qwen3.6-35b-a3b-vindex</a></td></tr>
- <tr><td>GPT-OSS 120B</td><td>MoE (OpenAI)</td><td>120B / ~13B active</td><td><em>building</em></td></tr>
  </table>
- <p>Three organizations, three architectures: Gemma dense, Qwen MoE, OpenAI MoE.</p>
-
- <hr>
-
- <h2>What's a Vindex?</h2>
- <p>Standard model weights tell you <em>what</em> a model computes. A vindex tells you <em>where</em> it stores specific knowledge and <em>which features</em> need to change for a targeted edit.</p>
- <p>Concretely: given a query like <code>Paris → capital</code>, a vindex walk returns the layers, feature directions, and token associations that encode that fact. A patch operation writes a rank-1 ΔW that suppresses or overwrites that association — compiled back to standard HuggingFace safetensors for inference.</p>
- <p>LarQL (the toolchain that builds vindexes) is open-source: <a href="https://github.com/chrishayuk/larql">chrishayuk/larql</a> · <a href="https://github.com/Divinci-AI/larql">Divinci-AI/larql</a>.</p>
-
- <hr>
-
- <h2>Research</h2>
- <p><strong>Paper 1 — Architectural Invariants of Transformer Computation</strong> <em>(arXiv forthcoming)</em><br>
- Five properties measured across every model in this collection. Three hold within ±15% coefficient of variation across architectures, organizations, and scales. One collapses under 1-bit quantization. One scales monotonically with model size.</p>
- <p><strong>Paper 2 — Constellation Edits</strong> <em>(draft)</em><br>
- Mechanistic knowledge editing in transformer feature space. Includes a negative result: why activation-space edits fail in 1-bit models, and what weight-space geometry reveals about why.</p>
- <p>Working notebooks: <a href="https://github.com/Divinci-AI/server/tree/preview/notebooks">github.com/Divinci-AI/server/tree/preview/notebooks</a></p>
-
- <hr>
-
- <h2>Working in Public</h2>
- <p>Every measurement in our papers traces back to a notebook and a commit. Negative results ship alongside positive ones — the compensation mechanism that defeats knowledge editing in 1-bit models is in the notebooks, not buried in a supplement.</p>
- <p>If you replicate a result and find a discrepancy, open an issue on the LarQL repo.</p>
-
- <div class="footer">
- Vindexes on this org are free for academic and research use (CC-BY-NC 4.0). Commercial licensing: <a href="mailto:mike@divinci.ai">mike@divinci.ai</a>
- </div>

  </body>
- </html>
+ <h1 id="divinci-ai">Divinci AI</h1>
+ <p class="tagline">Feature-level interpretability artifacts for open transformers —
+ built openly, validated empirically.</p>
+ <p>A <strong>vindex</strong> is a transformer's weights decompiled into
+ a queryable feature database. It exposes the entity associations,
+ circuit structure, and knowledge-editing surfaces that live inside a
+ model's FFN layers — without requiring GPU inference for most
+ operations.</p>
+ <p>Think of it as the model's index: the thing you search before you run
+ it.</p>
+ <hr />
+ <h2 id="interactive-viewer">Interactive viewer</h2>
+ <p><a href="https://huggingface.co/spaces/Divinci-AI/vindex-viewer"><img
+ src="https://huggingface.co/spaces/Divinci-AI/vindex-viewer/resolve/main/vindex-hero-bg.gif"
+ alt="LarQL Vindex Viewer — interactive 3D + 2D circuit visualization" /></a></p>
+ <p><strong><a
+ href="https://huggingface.co/spaces/Divinci-AI/vindex-viewer">→ Open the
+ interactive viewer</a></strong></p>
+ <p>Pick any of 9 models from the dropdown. Toggle between the 3D
+ cylinder spiral and a flat 2D circuit/network view. Hit <strong>⇌
+ Compare</strong> to render the current model alongside Bonsai 1-bit,
+ side-by-side — the contrast between fp16 structure (organized rings) and
+ 1-bit dissolution (scattered cloud) is the most direct picture we know
+ how to render of what 1-bit training does to a transformer's internal
+ organization. Search for entity features
+ (<code>?q=paris&amp;model=gemma-4-e2b</code>) to see real probe-derived
+ activations light up across the layer stack — backed by a 5000-token
+ offline-built search index.</p>
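The query-parameter search described above can be scripted. A minimal sketch that assumes only the Space URL and the `q`/`model` parameter names shown in the text; the helper name is illustrative:

```python
from urllib.parse import urlencode

BASE = "https://huggingface.co/spaces/Divinci-AI/vindex-viewer"

def viewer_url(query: str, model: str) -> str:
    """Build a deep link into the vindex viewer's entity-feature search."""
    return f"{BASE}?{urlencode({'q': query, 'model': model})}"

print(viewer_url("paris", "gemma-4-e2b"))
# → https://huggingface.co/spaces/Divinci-AI/vindex-viewer?q=paris&model=gemma-4-e2b
```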
+ <hr />
+ <h2 id="published-vindexes">Published vindexes</h2>
+ <p>Cross-family evidence in hand: <strong>Gemma</strong>,
+ <strong>Qwen3</strong>, <strong>Mistral</strong>,
+ <strong>Llama</strong>, <strong>OpenAI MoE</strong>, plus two 1-bit
+ controls.</p>
  <table>
+ <thead>
+ <tr>
+ <th>Model</th>
+ <th>Architecture</th>
+ <th>Params</th>
+ <th>Vindex</th>
+ <th>C4 (layer temp)</th>
+ <th>Notes</th>
+ </tr>
+ </thead>
+ <tbody>
+ <tr>
+ <td><strong>Gemma 4 E2B-it</strong></td>
+ <td>Dense (Gemma 4)</td>
+ <td>2B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/gemma-4-e2b-vindex">gemma-4-e2b-vindex</a></td>
+ <td><strong>0.0407 ± 0.0004</strong> ✓</td>
+ <td>3-seed validated; headline universal-constant model</td>
+ </tr>
+ <tr>
+ <td>Qwen3-0.6B</td>
+ <td>Dense (Qwen 3)</td>
+ <td>0.6B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/qwen3-0.6b-vindex">qwen3-0.6b-vindex</a></td>
+ <td>0.411</td>
+ <td>Smallest published; Qwen3 family-elevated C4</td>
+ </tr>
+ <tr>
+ <td>Qwen3-8B bf16</td>
+ <td>Dense (Qwen 3)</td>
+ <td>8B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/qwen3-8b-vindex">qwen3-8b-vindex</a></td>
+ <td>0.804</td>
+ <td>Architecture control for Bonsai</td>
+ </tr>
+ <tr>
+ <td>Qwen3.6-35B-A3B</td>
+ <td>MoE (Qwen 3.6)</td>
+ <td>35B / 3B active</td>
+ <td><a href="https://huggingface.co/Divinci-AI/qwen3.6-35b-a3b-vindex">qwen3.6-35b-a3b-vindex</a></td>
+ <td>—</td>
+ <td>256 experts, 40 layers</td>
+ </tr>
+ <tr>
+ <td>Ministral-3B</td>
+ <td>Dense (Mistral 3)</td>
+ <td>3B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/ministral-3b-vindex">ministral-3b-vindex</a></td>
+ <td>0.265</td>
+ <td>fp8 → bf16 reconstruction</td>
+ </tr>
+ <tr>
+ <td>Llama 3.1-8B</td>
+ <td>Dense (Llama 3.1)</td>
+ <td>8B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/llama-3.1-8b-vindex">llama-3.1-8b-vindex</a></td>
+ <td><strong>0.012</strong> ✓</td>
+ <td>Llama family signature</td>
+ </tr>
+ <tr>
+ <td>MedGemma 1.5-4B</td>
+ <td>Dense (Gemma multimodal)</td>
+ <td>4B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/medgemma-1.5-4b-vindex">medgemma-1.5-4b-vindex</a></td>
+ <td><strong>1.898 ⚠</strong></td>
+ <td>45× cohort anomaly — under investigation</td>
+ </tr>
+ <tr>
+ <td>GPT-OSS 120B</td>
+ <td>MoE (OpenAI)</td>
+ <td>120B</td>
+ <td><a href="https://huggingface.co/Divinci-AI/gpt-oss-120b-vindex">gpt-oss-120b-vindex</a></td>
+ <td>—</td>
+ <td>S[0] grows 117× with depth (L0=111 → final=13,056)</td>
+ </tr>
+ <tr>
+ <td><strong>Bonsai 8B</strong></td>
+ <td>1-bit (Qwen 3 base, post-quantized)</td>
+ <td>8B</td>
+ <td><em>vindex pending publish</em></td>
+ <td>0.429</td>
+ <td><strong>C5 = 1</strong> (circuit dissolved); var@64 = 0.093</td>
+ </tr>
+ <tr>
+ <td><strong>BitNet b1.58-2B-4T</strong></td>
+ <td>1-bit (Microsoft, native)</td>
+ <td>2B</td>
+ <td><em>vindex pending publish</em></td>
+ <td>(Phase 2 pending)</td>
+ <td><strong>var@64 = 0.111</strong> mean across 30 layers — n=2
+ confirmation of dissolution</td>
+ </tr>
+ </tbody>
  </table>
+ <hr />
+ <h2 id="whats-a-vindex">What's a vindex?</h2>
+ <p>Standard model weights tell you <em>what</em> a model computes. A
+ vindex tells you <em>where</em> it stores specific knowledge and
+ <em>which features</em> need to change for a targeted edit.</p>
+ <p>Concretely: given a query like <code>"Paris → capital"</code>, a
+ vindex walk returns the layers, feature directions, and token
+ associations that encode that fact. A patch operation writes a rank-1 ΔW
+ that suppresses or overwrites that association — compiled back to
+ standard HuggingFace safetensors for inference.</p>
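The rank-1 ΔW idea above can be sketched numerically. This is not LarQL's actual API, only a NumPy illustration of how adding an outer-product update to a weight block suppresses a single input-feature/output association; the weight shape and the directions u, v are made up:

```python
import numpy as np

def rank1_patch(W: np.ndarray, u: np.ndarray, v: np.ndarray,
                alpha: float = 1.0) -> np.ndarray:
    """Apply a rank-1 edit W' = W + alpha * u v^T.

    With alpha < 0 this suppresses the association mapping input
    direction v to output direction u; alpha > 0 strengthens it.
    """
    return W + alpha * np.outer(u, v)

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 32))   # stand-in for one FFN weight block
v = rng.normal(size=32)
v /= np.linalg.norm(v)          # unit input-feature direction
u = W @ v                       # the block's current output along v

W_edit = rank1_patch(W, u, v, alpha=-1.0)
print(np.linalg.norm(W @ v), np.linalg.norm(W_edit @ v))
# v's output is driven to ~0; directions orthogonal to v are untouched
```

Because the update is rank-1, the edit is both cheap to store and cheap to compile back into a full-precision weight file.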
+ <p>LarQL (the toolchain that builds vindexes) is open-source: <a
+ href="https://github.com/chrishayuk/larql">github.com/chrishayuk/larql</a>
+ | <a
+ href="https://github.com/Divinci-AI/larql">github.com/Divinci-AI/larql</a>.</p>
+ <hr />
+ <h2 id="research">Research</h2>
+ <h3
+ id="paper-1--architectural-invariants-of-transformer-computation">Paper
+ 1 — <em>Architectural Invariants of Transformer Computation</em></h3>
+ <p><em>arXiv preprint forthcoming</em></p>
+ <p>Five properties measured across every model in this collection.
+ <strong>Three hold within ±15% coefficient of variation</strong> across
+ architectures, organizations, and scales. <strong>One collapses under
+ 1-bit quantization</strong> — replicated across two independent 1-bit
+ models from two organizations (n = 2). <strong>One scales monotonically
+ with model size</strong>.</p>
+ <p>The headline universal constant — layer temperature C4 — is
+ reproducible at the <strong>1% precision level</strong>: a three-seed
+ run on Gemma 4 E2B gives <code>C4 = 0.0407 ± 0.0004</code>, with
+ circuit-stage count perfectly stable (<code>C5 = 4 ± 0</code>) across
+ all seeds.</p>
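The seed statistics quoted here (mean ± standard deviation, and the ±15% coefficient-of-variation criterion) reduce to a few lines of stdlib Python. The per-seed values below are invented to be consistent with the quoted C4 = 0.0407 ± 0.0004; they are not the actual runs:

```python
import statistics

c4_by_seed = [0.0403, 0.0407, 0.0411]  # hypothetical three-seed measurements

mean = statistics.mean(c4_by_seed)
sd = statistics.stdev(c4_by_seed)      # sample standard deviation
cv = sd / mean                         # coefficient of variation

print(f"C4 = {mean:.4f} ± {sd:.4f}  (CV = {cv:.1%})")
# for these values: C4 = 0.0407 ± 0.0004  (CV = 1.0%)
```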
+ <h3 id="paper-2--constellation-edits">Paper 2 — <em>Constellation
+ Edits</em></h3>
+ <p><em>draft, arXiv after 3-seed runs + α-sweep appendix</em></p>
+ <p>Mechanistic knowledge editing in transformer feature space. Includes
+ a negative result: why activation-space edits fail in 1-bit models, and
+ what weight-space geometry reveals about why.</p>
+ <h3 id="companion-blog-series--the-interpretability-diaries">Companion
+ blog series — <em>The Interpretability Diaries</em></h3>
+ <ul>
+ <li><a
+ href="https://divinci.ai/blog/architecture-every-llm-converges-to/">Part
+ I — The Architecture Every Language Model Converges To</a> — five
+ universal constants, what holds and what doesn't</li>
+ <li><a
+ href="https://divinci.ai/blog/deleting-paris-from-a-language-model/">Part
+ II — Deleting Paris from a Language Model</a> — Gate-3 surgical
+ knowledge edit with a receipt; rank-1 ΔW that suppresses one fact at
+ +0.02% perplexity</li>
+ <li><a href="https://divinci.ai/blog/when-the-circuit-dissolves/">Part
+ III — When the Circuit Dissolves</a> — two natively-trained 1-bit
+ models, two organizations, same dissolution: var@64 ≈ 0.10 vs ~0.85 for
+ fp16</li>
+ </ul>
+ <p>Working notebooks: <a
+ href="https://github.com/Divinci-AI/server/tree/preview/notebooks">github.com/Divinci-AI/server/tree/preview/notebooks</a></p>
+ <hr />
+ <h2 id="working-in-public">Working in public</h2>
+ <p>Every measurement in our papers traces back to a notebook and a
+ commit. Negative results ship alongside positive ones — the MLP
+ compensation mechanism that defeats knowledge editing in 1-bit models is
+ in the notebooks, not buried in a supplement.</p>
+ <p>If you replicate a result and find a discrepancy, open an issue on
+ the LarQL repo.</p>
+ <hr />
+ <p><em>Vindexes on this org are free for academic and research use
+ (CC-BY-NC 4.0). Commercial licensing: <a
+ href="mailto:mike@divinci.ai">mike@divinci.ai</a></em></p>

  </body>
+ </html>