EvilScript committed · Commit 4be32dd · verified · 1 Parent(s): fd5175b

Rewrite legacy model card for public users

Files changed (1): README.md +26 -22
README.md CHANGED
@@ -14,31 +14,38 @@ tags:

  # Legacy Activation Oracle: gemma-4-26B-A4B-it

- > **Deprecated Gemma 4 checkpoint**
- > This adapter was trained with the older generic `nl_probes/sft.py` path, not the architecture-aware `nl_probes/gemma4_sft.py` path now used for Gemma 4.
- > It does **not** follow the current Gemma 4 injection standard and should not be used for new experiments or for the `probabilistic_activation_oracles` taboo pipeline.
+ > **Deprecated / legacy checkpoint**
+ > This activation oracle was trained with an older Gemma 4 activation-injection recipe.
+ > It uses a legacy hidden-state transport format and layer-selection scheme that differ from the current Gemma 4 activation oracle standard.
+ >
+ > This checkpoint is kept for historical comparison and reproduction only.
+ > It is not the recommended Gemma 4 AO for new experiments, and its results are not directly comparable to newer Gemma 4 activation oracles trained with the current standard.

  This is a legacy LoRA adapter for [gemma-4-26B-A4B-it](https://huggingface.co/google/gemma-4-26B-A4B-it).
- It is kept for historical comparison only.
+ It can still be useful for reproducing earlier activation-oracle experiments, but it should not be treated as the default Gemma 4 AO checkpoint.

- ## Why This Repo Is Legacy
+ ## Why This Checkpoint Is Legacy

- This adapter predates the Gemma-4-specific training path added in this repo.
- The main incompatibilities are:
+ This model was trained before the current Gemma 4 AO injection convention was adopted.
+ In practice, that means:

- - **Legacy training entrypoint**: it was trained with `nl_probes/sft.py`, while current Gemma 4 oracles are trained with `nl_probes/gemma4_sft.py`.
- - **Wrong oracle-side injection layer for the current standard**: this adapter used `hook_onto_layer=1`, while the current Gemma 4 recipe injects at the first full-attention layer, which is layer `5` for this base model.
- - **Legacy read-layer mapping**: this adapter used the generic `25/50/75%` depth mapping from the old trainer, while the current Gemma 4 path snaps those reads to real full-attention layers.
- - **Validation gap**: this legacy recipe produced reasonable classification-style eval curves, but this repo explicitly notes that it did **not** establish correctness for the taboo extraction pipeline in `probabilistic_activation_oracles`.
+ - it uses an older activation transport / injection recipe
+ - it uses an older layer-selection convention
+ - it should be treated as a historical artifact rather than the default Gemma 4 AO

- Because the adapter was trained on a different steering / readout distribution than the new Gemma 4 standard, it is not the right checkpoint format for current Gemma 4 oracle work.
+ Classification-style evaluations may still look reasonable, but that does not make this checkpoint the right choice for current Gemma 4 AO work.
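The removed bullets above spell out the two layer-selection conventions concretely. A minimal sketch of the difference, for orientation only: the `layer_types` pattern, the 48-layer depth, and the nearest-layer snapping rule are illustrative assumptions, not values from this repo; only the `25/50/75%` fractions, the legacy `hook_onto_layer=1`, and the first-full-attention layer `5` come from the card itself.

```python
# Illustrative sketch, not code from nl_probes. Assumes a Gemma-style
# attention pattern in which layer 5 is the first full-attention layer,
# matching the card's claim about this base model.
n_layers = 48  # assumed total depth, for illustration only
layer_types = ["sliding_attention"] * n_layers
for i in range(5, n_layers, 6):  # assumed: every 6th layer is full attention
    layer_types[i] = "full_attention"
full_attn = [i for i, t in enumerate(layer_types) if t == "full_attention"]

# Legacy generic trainer: read at fixed depth fractions, whatever those layers are.
legacy_reads = [round(n_layers * f) for f in (0.25, 0.50, 0.75)]  # [12, 24, 36]
legacy_hook = 1  # the card's legacy hook_onto_layer

# Current Gemma 4 recipe, per the card: inject at the first full-attention
# layer and snap reads to real full-attention layers. (Nearest-layer snapping
# is an assumption; the card does not specify the exact rule.)
current_hook = full_attn[0]  # 5 for this base model
current_reads = [min(full_attn, key=lambda fa: abs(fa - r)) for r in legacy_reads]

print(legacy_hook, legacy_reads)    # 1 [12, 24, 36]
print(current_hook, current_reads)  # 5 [11, 23, 35]
```

Under these assumptions the two recipes hook and read at different layers at every depth, which is why checkpoints from the two recipes are not interchangeable.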

  ## When To Use It

- - Use it only if you are reproducing the earlier generic Gemma 4 oracle experiments.
- - Do not use it as the default Gemma 4 oracle for new work.
+ Use this checkpoint only if you specifically want to:
+
+ - reproduce earlier Gemma 4 AO results
+ - compare older and newer AO training conventions
+ - inspect how the legacy recipe behaves
+
+ For new Gemma 4 AO experiments, use a checkpoint trained with the current standard instead.

- ## Quick Start (Legacy Only)
+ ## Quick Start

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
@@ -62,15 +69,12 @@ model.eval()

  |-----------|-------|
  | **Base model** | `google/gemma-4-26B-A4B-it` |
  | **Adapter** | LoRA |
- | **Training entrypoint** | `nl_probes/sft.py` |
  | **Training tasks** | LatentQA, classification, PastLens (next-token), SAE features |
- | **Activation injection** | Legacy generic steering setup |
- | **Oracle hook layer** | `1` |
- | **Read-layer selection** | Generic `25/50/75%` depth mapping |
- | **Current Gemma 4 standard** | `nl_probes/gemma4_sft.py` with first-full-attention injection and full-attention-aware read-layer selection |
+ | **Checkpoint status** | Legacy / deprecated |
+ | **Activation injection** | Older Gemma 4 AO recipe |
+ | **Recommended use** | Historical comparison and reproduction only |

  ## Related Resources

- - **Gemma 4 notes in this repo**: `docs/gemma4_oracle_training_notes.md`
- - **Internal port report**: `docs/evilscript_gemma4_report.md`
+ - **Paper**: [Activation Oracles (arXiv:2512.15674)](https://arxiv.org/abs/2512.15674)
  - **Code**: [activation_oracles](https://github.com/adamkarvonen/activation_oracles)
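The unchanged middle of the Quick Start block (the actual load code between the `transformers` import and the `model.eval()` shown in the second hunk header) is elided by the diff. For orientation only, a minimal sketch of the usual PEFT loading pattern such a card typically follows; the adapter repo id is a placeholder, and everything except the base model name is an assumption rather than this card's actual code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "google/gemma-4-26B-A4B-it"
ADAPTER = "<this-adapter-repo-id>"  # placeholder: substitute this repo's Hub id

# Load the base model and tokenizer, then attach the LoRA adapter on top.
tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE,
    torch_dtype=torch.bfloat16,  # assumed dtype; pick what fits your hardware
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, ADAPTER)  # LoRA weights merged at inference time
model.eval()
```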