mikeumus-divincian committed
Commit fa094ae · verified · 1 Parent(s): f1856a8

docs: clarify inference path via larql compile into model

Files changed (1): README.md +16 -2
README.md CHANGED
@@ -25,12 +25,28 @@ A **vindex** is a transformer's weights decompiled into a queryable feature database
  | A feature-space index for Gemma4-e2b-it | A language model |
  | Exposes entity associations via `/v1/walk` | `/v1/infer` does NOT produce factual completions |
  | Enables rank-1 knowledge edits (DELETE/INSERT) | Not a replacement for the base Gemma4 weights |
- | Circuit analysis (broadcast→domain→entity→prediction) | Not a general inference engine |
+ | Circuit analysis (broadcast→domain→entity→prediction) | |
+ | Editing surface for `larql compile into model` → standard HuggingFace safetensors inference | Not a general inference engine |

- **Critical note on `/v1/infer`:** This endpoint returns a feature-modulated projection of the host model's activations — not a coherent text-generation distribution. Output is incoherent subword tokens by design (the vindex is a feature graph, not a full transformer forward pass). For factual completion, use `google/gemma-4-e2b-it` directly. Use `/v1/walk` and `/v1/patch` for the validated operations this vindex is designed for.
+ **Critical note on `/v1/infer`:** This endpoint returns a feature-modulated projection of the host model's activations — not a coherent text-generation distribution. Output is incoherent subword tokens by design (the vindex is a feature graph, not a full transformer forward pass). For factual text generation from the *base* model, use `google/gemma-4-e2b-it` directly. To run inference on an **edited** model (after DELETE/INSERT patches), use `larql compile into model` — this exports MEMIT-edited weights to HuggingFace safetensors that load like any standard `transformers` model. Use `/v1/walk` and `/v1/patch` for the validated vindex operations.

  **Validated surfaces:** `/v1/walk` (entity-association retrieval), `/v1/describe` (feature neighborhood), `/v1/patch` DELETE/INSERT (rank-1 weight editing, Gate 3 confirmed).

+ **Compile edited vindex to a runnable model:**
+ ```bash
+ # After applying patches, export to safetensors for standard inference
+ larql compile into model \
+   --vindex Divinci-AI/gemma-4-e2b-vindex \
+   --output ./edited-gemma4 \
+   --format safetensors
+ ```
+
+ ```python
+ # Run the exported weights with standard Transformers
+ from transformers import AutoModelForCausalLM
+ model = AutoModelForCausalLM.from_pretrained('./edited-gemma4')
+ ```
+
  ## Quick start

  ```bash
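The "rank-1 knowledge edits" the diff refers to can be pictured with a toy sketch. This is a hypothetical illustration of the general rank-1 update W + u·vᵀ, not the larql or MEMIT implementation (which derives the u and v directions from the model's activation statistics); `rank1_edit`, the 2×2 matrix, and the direction vectors here are invented for the example.

```python
# Toy sketch of a rank-1 weight edit (hypothetical; not the larql implementation).
# A rank-1 patch adds an outer product u * v^T to a weight matrix W, so only the
# single key direction v has its output changed; inputs orthogonal to v pass
# through W exactly as before.

def rank1_edit(W, u, v, scale=1.0):
    """Return W + scale * outer(u, v), leaving W itself untouched."""
    return [
        [W[i][j] + scale * u[i] * v[j] for j in range(len(v))]
        for i in range(len(u))
    ]

W = [[1.0, 0.0],
     [0.0, 1.0]]
u = [1.0, 0.0]   # value direction to write
v = [0.0, 1.0]   # key direction being edited

inserted = rank1_edit(W, u, v, scale=1.0)     # INSERT: add the association
reverted = rank1_edit(inserted, u, v, -1.0)   # DELETE: subtract it back out

print(inserted)  # [[1.0, 1.0], [0.0, 1.0]]
print(reverted)  # [[1.0, 0.0], [0.0, 1.0]]
```

Because the update touches only the span of u·vᵀ, DELETE is just the same edit with the opposite sign, which is why a patched model can be exported and run as ordinary weights.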