Add README.md
README.md
ADDED
@@ -0,0 +1,74 @@
---
license: mit
tags:
- vindex
- moe
- sparse-routing
- mechanistic-interpretability
base_model: deepseek-ai/DeepSeek-V4-Flash
---

# deepseek-v4-flash-vindex

Per-expert gate-vector vindex for `deepseek-ai/DeepSeek-V4-Flash`, built by the [Divinci-AI](https://huggingface.co/Divinci-AI) team for use with [LarQL](https://github.com/chrishayuk/larql) (Chris Hay) and adjacent feature-routing inference research.

## Vindex specs
- **Source model**: `deepseek-ai/DeepSeek-V4-Flash`
- **Architecture**: `deepseek_v4` (43 layers, 4096 hidden, 2048 moe_intermediate)
- **Experts**: 256 routed + 1 shared, 6 per token
- **Layers indexed**: 43 MoE layers (L00-L42)
- **Features per expert**: 64 (top-K right singular vectors of `gate_proj`)
- **Format**: float32, mmap-friendly contiguous binary
- **Total size**: 11.54 GB

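As a sanity check on the figures in the spec list, the total size can be reproduced from the stated dimensions. This is a minimal sketch that assumes only the 256 routed experts are stored (not the shared expert) and that "GB" means 10^9 bytes; both are assumptions, although the arithmetic does land on the stated total.

```python
# Dimensions taken from the spec list.
moe_layers = 43
n_experts = 256        # assumption: routed experts only, shared expert not stored
num_feats = 64         # top-K right singular vectors kept per expert
hidden_size = 4096
bytes_per_float32 = 4

# One layer holds n_experts * num_feats vectors of length hidden_size.
floats_per_layer = n_experts * num_feats * hidden_size
bytes_per_layer = floats_per_layer * bytes_per_float32
total_bytes = moe_layers * bytes_per_layer

print(f"per layer: {bytes_per_layer / 1e9:.3f} GB")
print(f"total:     {total_bytes / 1e9:.2f} GB")
```

If the file on disk differs noticeably from this figure, the layout assumptions above are the first thing to check.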
## What this is
- **`gate_vectors.bin`**: flat float32 binary, layout `[moe_layers, n_experts, num_feats, hidden_size]`. Each per-expert chunk holds the top-64 right singular vectors (`Vt[:K, :]`) of that expert's `gate_proj` weight after FP8/MXFP4 dequantization.
- **`gate_vectors_index.json`**: sidecar with per-layer `file_offset` (bytes), `shape`, and SVD stats (`median_var64`, `q25_var64`, `q75_var64`). Serves as the lookup table for mmap access.
- **`phase1_moe_svd.json`**: full per-layer Phase 1 stats (routed/shared/router decomposition).
- **`phase2_router_svd.json`**: router weight SVD per layer (top-K variance, effective rank, s0/s1 ratio).

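To make the `Vt[:K, :]` notation concrete, here is a minimal sketch of extracting the top-K right singular vectors from a single weight matrix with NumPy. It uses a small random matrix as a stand-in for a dequantized `gate_proj`, and the `var_k` quantity is only an assumed reading of what a statistic like `median_var64` aggregates (share of squared singular values in the top K); the builder script may define it differently.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one expert's dequantized gate_proj weight,
# shaped (moe_intermediate, hidden). Real sizes would be (2048, 4096).
W = rng.standard_normal((32, 64)).astype(np.float32)
K = 8  # the vindex uses K = 64

# Thin SVD: W = U @ diag(S) @ Vt. Rows of Vt are the right singular vectors,
# ordered by decreasing singular value.
U, S, Vt = np.linalg.svd(W, full_matrices=False)
top_k = Vt[:K, :]  # shape (K, hidden)

# Assumed definition of the variance statistic: fraction of total squared
# singular value mass captured by the top K directions.
var_k = float((S[:K] ** 2).sum() / (S ** 2).sum())

print("top_k shape:", top_k.shape)
print("rows orthonormal:", np.allclose(top_k @ top_k.T, np.eye(K), atol=1e-4))
print("variance share in top K:", round(var_k, 3))
```

Because the rows are orthonormal, each stored vector has unit L2 norm, which is a cheap integrity check after loading.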
## What this is **not**
- Not a runnable model (no inference path on its own).
- Not raw weights: only the top-K right singular vectors of `gate_proj`, with the singular values *not retained*. Reconstruction is lossy.
- Not a fine-tune or quantization of the base model.

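The lossy-reconstruction point can be illustrated on a toy matrix. The sketch below assumes a random stand-in for `gate_proj` and shows two things: even the best rank-K approximation (which needs `U` and `S`, neither of which is shipped) leaves a nonzero error, and what the retained directions alone do provide is a projector onto the top-K input subspace.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((32, 64))  # toy stand-in for a gate_proj weight
K = 8                              # the vindex uses K = 64

U, S, Vt = np.linalg.svd(W, full_matrices=False)
Vk = Vt[:K, :]                     # the only part the vindex retains

# Best rank-K approximation. This needs U and S, which are NOT in the vindex.
W_k = (U[:, :K] * S[:K]) @ Vk
rel_err = np.linalg.norm(W - W_k) / np.linalg.norm(W)

# Eckart-Young: the Frobenius error equals the discarded singular value mass.
expected = np.sqrt((S[K:] ** 2).sum() / (S ** 2).sum())

# With Vk alone, one can still project inputs onto the top-K subspace.
P = Vk.T @ Vk                      # (hidden, hidden) orthogonal projector
x = rng.standard_normal(64)
x_in = P @ x                       # component of x inside the subspace

print("relative rank-K error:", round(float(rel_err), 3))
print("matches discarded mass:", np.isclose(rel_err, expected))
print("projector is idempotent:", np.allclose(P @ P, P))
```

In other words, the vindex supports questions about which input directions an expert's gate is sensitive to, but not how strongly, and not what the gate outputs.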
## Usage

```python
import json

import numpy as np

# Memory-map the binary as one flat float32 array
arr = np.memmap("gate_vectors.bin", dtype=np.float32, mode="r")

# Load the sidecar index (per-layer offsets and model config)
with open("gate_vectors_index.json") as f:
    idx = json.load(f)

moe = idx["model_config"]["moe"]
n_experts = moe["n_routed_experts"]
n_feats = idx["num_feats"]
hidden = moe["hidden_size"]

# Get layer L's experts
def get_layer(L):
    meta = idx["layers"][str(L)]
    offset = meta["file_offset"] // 4  # bytes → float32 elements
    n = n_experts * n_feats * hidden
    return arr[offset:offset + n].reshape(n_experts, n_feats, hidden)

V_L1 = get_layer(1)  # shape (n_experts, n_feats, hidden)
print("L1 expert 0 top vector L2 norm:", np.linalg.norm(V_L1[0, 0]))  # ≈ 1.0
```

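Once a layer is loaded, one possible use is to ask how much of a hidden-state vector falls inside each expert's top-K gate subspace. The sketch below is purely illustrative: it is not the model's router and not a LarQL API, the hidden state `x` is random, and a synthetic array with orthonormal rows stands in for the output of `get_layer` so the example runs without the 11.54 GB file.

```python
import numpy as np

rng = np.random.default_rng(0)
n_experts, n_feats, hidden = 4, 8, 16  # toy sizes; the real vindex is 256, 64, 4096

# Synthetic stand-in for one layer: each expert gets n_feats orthonormal rows,
# built from the Q factor of a random matrix.
V = np.stack([
    np.linalg.qr(rng.standard_normal((hidden, n_feats)))[0].T
    for _ in range(n_experts)
])  # shape (n_experts, n_feats, hidden)

x = rng.standard_normal(hidden)  # hypothetical hidden state

# Coordinates of x along every stored direction of every expert.
coords = V @ x  # shape (n_experts, n_feats)

# Fraction of ||x||^2 captured by each expert's subspace (between 0 and 1).
energy = (coords ** 2).sum(axis=1) / (x @ x)

# Experts ordered from most to least aligned with x.
ranking = np.argsort(energy)[::-1]
print("energy per expert:", np.round(energy, 3))
print("experts by alignment:", ranking)
```

Since singular values are not retained, this score weights all stored directions equally; it measures alignment with the subspace, not the magnitude of the gate response.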
## Citation

If you use this vindex in research, please cite:

```bibtex
@misc{divinci_deepseek_v4_flash_vindex_2026,
  title = {deepseek-v4-flash-vindex: per-expert gate-vector vindex for deepseek-ai/DeepSeek-V4-Flash},
  author = {Divinci-AI},
  year = {2026},
  url = {https://huggingface.co/Divinci-AI/deepseek-v4-flash-vindex},
}
```

Built using [`moe_vindex_builder.py`](https://github.com/Divinci-AI/server/blob/preview/notebooks/moe_vindex_builder.py).