# Qwen3.6-35B-A3B – LarQL Vindex v0.1
First published LarQL vindex for Qwen's Qwen3.6-35B-A3B MoE model.
A vindex is a transformer's weights decompiled into a queryable feature database: entity associations, circuit structure, and knowledge-editing surfaces exposed as APIs. No GPU required for most operations.
## What this is / What this is not

**What this IS:**

- Feature-space index for Qwen3.6-35B-A3B (35B total, 3B active, 256 experts)
- Exposes entity associations via `/v1/walk`
- Enables rank-1 knowledge edits (DELETE/INSERT) via `/v1/patch`
- Source material for `larql compile into model`, which produces standard HuggingFace safetensors for inference

**What this IS NOT:**

- A drop-in replacement for `Qwen/Qwen3.6-35B-A3B` (use that for direct generation)
- A text-generation engine: `/v1/infer` returns feature-modulated projections, not coherent completions
## Quickstart
```bash
# Query entity associations
curl "$LARQL_SERVICE_URL/v1/walk?prompt=Paris&layers=10-30&top=10" \
  -H "Authorization: Bearer $INTERNAL_LARQL_S2S_TOKEN"
```
```bash
# Apply a DELETE patch
curl -X POST "$LARQL_SERVICE_URL/v1/patches/apply" \
  -H "Authorization: Bearer $INTERNAL_LARQL_S2S_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name":"delete-example","patch":{"version":1,"base_model":"qwen3.6-35b-a3b","operations":[{"op":"delete","entity":"Paris","relation":"capital","layer":20,"feature":42}]}}'
```
```bash
# Compile to standard safetensors for inference on an edited model
larql compile into model \
  --vindex Divinci-AI/qwen3.6-35b-a3b-vindex \
  --output ./edited-qwen36 \
  --format safetensors
```
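After compiling, the output directory should load like any Hugging Face checkpoint. A hedged sketch, assuming `larql compile` writes a complete checkpoint (config, tokenizer, safetensors shards) into `./edited-qwen36`; if it emits weights only, copy the config and tokenizer files from the base model first.

```python
# Ordinary inference on the compiled, edited checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./edited-qwen36")
model = AutoModelForCausalLM.from_pretrained(
    "./edited-qwen36", torch_dtype="auto", device_map="auto"
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```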
## Architecture Details
- Architecture: Qwen3.6 MoE (qwen3_5_moe)
- Layers: 40
- Hidden size: 2048
- Experts: 256 total, 8 active per token
- MoE intermediate size: 512 per expert
- Source weights: bf16 safetensors
- Feature aggregation: Router-weighted SVD across sampled experts, top-64 principal directions per layer
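As an illustration of the router-weighted SVD aggregation described in the last bullet, here is a minimal per-layer sketch. It is not the exact LarQL procedure: the [512, 2048] gate shape follows from the sizes listed above, but weighting by mean router probability and taking right singular vectors as feature directions are assumptions.

```python
# Illustrative router-weighted SVD aggregation for one MoE layer (assumed procedure).
import numpy as np

def aggregate_gate_features(expert_gates, router_probs, k=64):
    # expert_gates: gate_proj matrices for a (possibly sampled) subset of the
    #   256 experts, each shaped [512 intermediate, 2048 hidden].
    # router_probs: mean routing probability for each of those experts over a
    #   sample corpus (assumption: used as a scalar weight per expert).
    stacked = np.concatenate(
        [p * w for p, w in zip(router_probs, expert_gates)], axis=0
    )  # [n_experts * 512, 2048]
    # Right singular vectors live in the 2048-dim hidden space and serve as
    # the aggregated feature directions; keep the top-k of them.
    _, _, vt = np.linalg.svd(stacked, full_matrices=False)
    return vt[:k]  # [k, 2048]
```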
## Research Findings
This vindex is part of the cross-architecture study in "Architectural Invariants of Transformer Computation: What Survives Scale, Training, and Quantization" (arXiv forthcoming).
Phase 1 SVD measurements (40 layers, 256 packed experts):
- var@64 range: 0.265–0.388 (mean 0.305)
- S[0] range: 0.9–1.5 (small absolute values, consistent with tight initialization of the 512-dim experts)
- var@64 is substantially higher than in MXFP4-quantized models (0.032–0.066), consistent with bf16 MoE small-expert structure
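Here var@64 is read as the fraction of squared singular-value mass captured by the top 64 directions of a layer's aggregated matrix; a one-function sketch under that assumption:

```python
# var@64 as (assumed) fraction of squared singular-value mass in the top-64 directions.
import numpy as np

def var_at_k(singular_values: np.ndarray, k: int = 64) -> float:
    s2 = np.asarray(singular_values, dtype=np.float64) ** 2
    return float(s2[:k].sum() / s2.sum())
```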
## Contents
| File | Size | Description |
|---|---|---|
| gate_vectors.bin | 10 MB | Aggregated gate feature directions, f16 [64 features × 2048 hidden per layer] |
| down_features.bin | 10 MB | Aggregated down-projection output directions, f16 |
| embeddings.bin | 970 MB | Token embeddings, 248,320 × 2048 (f16) |
| router_weights.bin | 41 MB | MoE router weights per layer, f16 |
| norms.bin | 324 KB | Per-layer normalization weights, f16 |
| down_meta.bin | 221 KB | Feature labels via vocab projection (top-10 tokens per feature) |
| index.json | 8 KB | Metadata: 40 layers, hidden=2048, 256 experts |
| manifest.json | 785 B | Vindex version manifest, SHA256 checksums |
Total: ~1.0 GB
Note on feature aggregation: Unlike dense-model vindexes (which store one row per FFN neuron), MoE vindexes store top-64 principal component directions aggregated across all 256 experts per layer. This keeps the artifact size tractable while preserving the dominant feature directions.
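To inspect the artifacts directly, the sketch below assumes `gate_vectors.bin` is raw little-endian float16 with no header, laid out as [40 layers, 64 features, 2048 hidden] (40 × 64 × 2048 × 2 bytes matches the listed 10 MB); the authoritative layout should be taken from index.json and manifest.json.

```python
# Memory-map gate_vectors.bin under the assumed raw [40, 64, 2048] f16 layout.
import numpy as np

gate = np.memmap("gate_vectors.bin", dtype=np.float16, mode="r").reshape(40, 64, 2048)
# e.g. the feature referenced in the patch example: layer 20, feature 42
print(gate.shape, gate[20, 42, :5])
```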
## Roadmap
- Gate 3 validation: DELETE patch suppression test pending
- Feature clustering: k-means over gate features (not yet included in v0.1)
- Wikidata relation matching: deferred to v0.2
- Gemma 4 26B MoE vindex: in development
## Built with LarQL

See `Divinci-AI/larql` and the upstream `chrishayuk/larql`.
## Citation
```bibtex
@misc{mooring2026invariants,
  title={Architectural Invariants of Transformer Computation: What Survives Scale, Training, and Quantization},
  author={Mooring, Mike},
  year={2026},
  note={arXiv forthcoming. See https://huggingface.co/Divinci-AI/qwen3.6-35b-a3b-vindex}
}
```
## Acknowledgments
- Chris Hayuk for creating LarQL.
- Qwen team for Qwen3.6-35B-A3B.
**License:** CC-BY-NC 4.0, for academic and research use. Contact mike@divinci.ai for commercial licensing.