---
license: other
tags:
- vindex
- moe
- sparse-routing
- mechanistic-interpretability
base_model: moonshotai/Kimi-K2-Instruct
---

# kimi-k2-instruct-vindex

Per-expert gate-vector vindex for `moonshotai/Kimi-K2-Instruct`, built by the [Divinci-AI](https://huggingface.co/Divinci-AI) team for use with [LarQL](https://github.com/chrishayuk/larql) (Chris Hay) and adjacent feature-routing inference research.

## Vindex specs

- **Source model**: `moonshotai/Kimi-K2-Instruct`
- **Architecture**: `kimi_k2` (61 layers, 7168 hidden, 2048 moe_intermediate)
- **Experts**: 384 routed + 1 shared, 8 active per token
- **Layers indexed**: 60 MoE layers (L01-L60)
- **Features per expert**: 64 (top-K right singular vectors of `gate_proj`)
- **Format**: float32, mmap-friendly contiguous binary
- **Total size**: 42.28 GB

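As a sanity check, the quoted total size follows directly from the layout dimensions listed above (pure arithmetic, no file access needed):

```python
# Total size implied by the layout
# [moe_layers, n_experts, num_feats, hidden_size] in float32 (4 bytes each).
moe_layers, n_experts, num_feats, hidden_size = 60, 384, 64, 7168
total_bytes = moe_layers * n_experts * num_feats * hidden_size * 4
print(f"{total_bytes / 1e9:.2f} GB")  # → 42.28 GB
```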
## What this is

- **`gate_vectors.bin`** — flat float32 binary, layout `[moe_layers, n_experts, num_feats, hidden_size]`. Each per-expert chunk holds the top-64 right singular vectors (`Vt[:K, :]`) of that expert's `gate_proj` weight after fp8/MXFP4 dequantization.
- **`gate_vectors_index.json`** — sidecar with per-layer `file_offset` (bytes), `shape`, and SVD stats (`median_var64`, `q25_var64`, `q75_var64`); serves as the lookup table for mmap access.
- **`phase1_moe_svd.json`** — full per-layer Phase 1 stats (routed/shared/router decomposition).
- **`phase2_router_svd.json`** — per-layer router weight SVD (top-K variance, effective rank, s0/s1 ratio).

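Given that layout, locating one expert's chunk is simple stride arithmetic. A minimal sketch, assuming chunks are packed contiguously in row-major order as the layout implies (in practice, the per-layer `file_offset` from `gate_vectors_index.json` is the authoritative starting point; `expert_offset` is an illustrative helper, not part of the release):

```python
# Byte offset of expert e's chunk for MoE layer l (both 0-indexed here),
# assuming row-major [moe_layers, n_experts, num_feats, hidden_size] float32.
n_experts, num_feats, hidden_size = 384, 64, 7168
ITEMSIZE = 4  # bytes per float32

def expert_offset(layer: int, expert: int) -> int:
    per_expert = num_feats * hidden_size * ITEMSIZE  # one expert's chunk
    per_layer = n_experts * per_expert               # one full MoE layer
    return layer * per_layer + expert * per_expert

print(expert_offset(0, 1))  # → 1835008 (second expert of the first layer)
```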
## What this is **not**

- Not a runnable model (no inference path on its own).
- Not raw weights — only top-K right singular vectors of `gate_proj`, with the singular values *not retained*. Reconstruction is lossy.
- Not a fine-tune or quantization of the base model.

## Usage

```python
import json

import numpy as np

# Memory-map the flat float32 binary; nothing is read until sliced.
arr = np.memmap("gate_vectors.bin", dtype=np.float32, mode="r")

# Sidecar index: per-layer byte offsets plus model dimensions.
with open("gate_vectors_index.json") as f:
    idx = json.load(f)
moe = idx["model_config"]["moe"]
n_experts = moe["n_routed_experts"]
n_feats = idx["num_feats"]
hidden = moe["hidden_size"]

def get_layer(L):
    """Return layer L's gate vectors, shape (n_experts, n_feats, hidden)."""
    meta = idx["layers"][str(L)]
    offset = meta["file_offset"] // 4  # bytes → float32 elements
    n = n_experts * n_feats * hidden
    return arr[offset:offset + n].reshape(n_experts, n_feats, hidden)

V_L1 = get_layer(1)  # shape (n_experts, n_feats, hidden)
print("L1 expert 0 top vector L2 norm:", np.linalg.norm(V_L1[0, 0]))  # ≈ 1.0
```
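Because each expert chunk's rows are (approximately) orthonormal singular vectors, one natural downstream score is the fraction of a hidden state's energy that falls inside an expert's feature subspace. A minimal sketch using a random orthonormal stand-in for a real expert chunk (`V_e` and `h` are illustrative names; real vectors would come from `get_layer` above, and a real hidden state from the base model):

```python
import numpy as np

rng = np.random.default_rng(0)
n_feats, hidden = 64, 7168

# Stand-in for one expert's chunk: an orthonormal (n_feats, hidden) basis via QR.
Q, _ = np.linalg.qr(rng.standard_normal((hidden, n_feats)))
V_e = Q.T  # rows are orthonormal feature directions

h = rng.standard_normal(hidden)                 # hypothetical hidden state
coeffs = V_e @ h                                # per-feature activations
energy = np.dot(coeffs, coeffs) / np.dot(h, h)  # fraction of ||h||^2 captured
print(f"captured energy fraction: {energy:.4f}")
```

For a random 64-dimensional subspace of a 7168-dimensional space this fraction hovers near 64/7168 ≈ 0.009; real gate subspaces evaluated on real hidden states would be the interesting comparison.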

## Citation

If you use this vindex in research, please cite:

```bibtex
@misc{divinci_kimi_k2_instruct_vindex_2026,
  title  = {kimi-k2-instruct-vindex: per-expert gate-vector vindex for moonshotai/Kimi-K2-Instruct},
  author = {Divinci-AI},
  year   = {2026},
  url    = {https://huggingface.co/Divinci-AI/kimi-k2-instruct-vindex},
}
```

Built using [`moe_vindex_builder.py`](https://github.com/Divinci-AI/server/blob/preview/notebooks/moe_vindex_builder.py).