---
base_model: google/gemma-4-31B-it
library_name: peft
license: apache-2.0
tags:
  - activation-oracles
  - interpretability
  - lora
  - self-introspection
  - sae
  - deprecated
  - legacy
---

# Legacy Activation Oracle: gemma-4-31B-it

> **Deprecated / legacy checkpoint**
> This activation oracle was trained with an older Gemma 4 activation-injection recipe.
> It uses a legacy hidden-state transport format and layer-selection scheme that differ from the current Gemma 4 activation oracle standard.
>
> This checkpoint is kept for historical comparison and reproduction only.
> It is not the recommended Gemma 4 activation oracle (AO) for new experiments, and its results are not directly comparable to newer Gemma 4 activation oracles trained with the current standard.

This is a legacy LoRA adapter for [gemma-4-31B-it](https://huggingface.co/google/gemma-4-31B-it).
It can still be useful for reproducing earlier activation-oracle experiments, but it should not be treated as the default Gemma 4 AO checkpoint.

## Why This Checkpoint Is Legacy

This model was trained before the current Gemma 4 AO injection convention was adopted.
In practice, that means:

- it uses an older activation transport / injection recipe
- it uses an older layer-selection convention
- it should be treated as a historical artifact rather than the default Gemma 4 AO

Classification-style evaluations may still look reasonable, but that does not make this checkpoint the right choice for current Gemma 4 AO work.
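For concreteness, here is a minimal sketch of what "reading activations at a chosen layer" looks like with plain `transformers`. This is a generic illustration, not the legacy transport/injection format itself, and `LAYER_IDX` is a hypothetical placeholder: the legacy and current conventions select different layers, and the actual indices are not documented in this card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

LAYER_IDX = 20  # hypothetical; the legacy layer choice is not documented here

tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-31B-it")
target = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-31B-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

inputs = tokenizer("The quick brown fox", return_tensors="pt").to(target.device)
with torch.no_grad():
    out = target(**inputs, output_hidden_states=True)

# out.hidden_states is a tuple of (num_layers + 1) tensors; index 0 is the
# embedding output, so index LAYER_IDX is the residual stream after that layer.
acts = out.hidden_states[LAYER_IDX]  # shape: (batch, seq_len, hidden_dim)
```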

## When To Use It

Use this checkpoint only if you specifically want to:

- reproduce earlier Gemma 4 AO results
- compare older and newer AO training conventions
- inspect how the legacy recipe behaves

For new Gemma 4 AO experiments, use a checkpoint trained with the current standard instead.

## Quick Start

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model in bfloat16, sharded across available devices.
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-31B-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-31B-it")

# Attach the legacy LoRA adapter and switch to inference mode.
model = PeftModel.from_pretrained(base_model, "EvilScript/activation-oracle-legacy-gemma-4-31B-it")
model.eval()
```
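Once loaded, a quick generation call confirms the adapter attaches correctly. This only sanity-checks loading: actually querying the oracle requires injecting activations with the legacy recipe from the linked repo, and the chat-template usage below is a generic assumption rather than the documented AO prompt format.

```python
# Sanity check: generate from the adapted model.
messages = [{"role": "user", "content": "Hello"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output_ids = model.generate(input_ids, max_new_tokens=50)

# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```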

## Legacy Training Details

| Parameter | Value |
|-----------|-------|
| **Base model** | `google/gemma-4-31B-it` |
| **Adapter** | LoRA |
| **Training tasks** | LatentQA, classification, PastLens (next-token), SAE features |
| **Checkpoint status** | Legacy / deprecated |
| **Activation injection** | Older Gemma 4 AO recipe |
| **Recommended use** | Historical comparison and reproduction only |

## Related Resources

- **Paper**: [Activation Oracles (arXiv:2512.15674)](https://arxiv.org/abs/2512.15674)
- **Code**: [activation_oracles](https://github.com/adamkarvonen/activation_oracles)