---
base_model: google/gemma-4-31B-it
library_name: peft
license: apache-2.0
tags:
- activation-oracles
- interpretability
- lora
- self-introspection
- sae
- deprecated
- legacy
---
# Legacy Activation Oracle: gemma-4-31B-it
> **Deprecated / legacy checkpoint**
> This activation oracle was trained with an older Gemma 4 activation-injection recipe.
> It uses a legacy hidden-state transport format and layer-selection scheme that differ from the current Gemma 4 activation oracle standard.
>
> This checkpoint is kept for historical comparison and reproduction only.
> It is not the recommended Gemma 4 activation oracle (AO) for new experiments, and its results are not directly comparable to newer Gemma 4 AOs trained with the current standard.
This is a legacy LoRA adapter for [gemma-4-31B-it](https://huggingface.co/google/gemma-4-31B-it).
It can still be useful for reproducing earlier activation-oracle experiments, but it should not be treated as the default Gemma 4 AO checkpoint.
## Why This Checkpoint Is Legacy
This model was trained before the current Gemma 4 AO injection convention was adopted.
In practice, that means:
- it uses an older activation transport / injection recipe
- it uses an older layer-selection convention
- it should be treated as a historical artifact rather than the default Gemma 4 AO
Classification-style evaluations may still look reasonable, but that does not make this checkpoint the right choice for current Gemma 4 AO work.
## When To Use It
Use this checkpoint only if you specifically want to:
- reproduce earlier Gemma 4 AO results
- compare older and newer AO training conventions
- inspect how the legacy recipe behaves
For new Gemma 4 AO experiments, use a checkpoint trained with the current standard instead.
## Quick Start
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load the base model in bfloat16, sharded across available devices.
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-4-31B-it",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-4-31B-it")

# Attach the legacy LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(
    base_model,
    "EvilScript/activation-oracle-legacy-gemma-4-31B-it",
)
model.eval()
```
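Loading the adapter does not by itself exercise the legacy activation-injection recipe; that pipeline lives in the linked `activation_oracles` repository. As a quick sanity check that the adapted model loads and generates, here is a minimal sketch using the standard chat-template API (the prompt text is illustrative):

```python
# Sanity check only: plain chat generation with the adapted model.
# This does NOT run the legacy activation-injection recipe, which is
# implemented in the activation_oracles repository.
messages = [{"role": "user", "content": "Briefly describe what an activation oracle does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=64)

# Decode only the newly generated tokens.
print(tokenizer.decode(output[0, inputs.shape[-1]:], skip_special_tokens=True))
```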
## Legacy Training Details
| Parameter | Value |
|-----------|-------|
| **Base model** | `google/gemma-4-31B-it` |
| **Adapter** | LoRA |
| **Training tasks** | LatentQA, classification, PastLens (next-token), SAE features |
| **Checkpoint status** | Legacy / deprecated |
| **Activation injection** | Older Gemma 4 AO recipe |
| **Recommended use** | Historical comparison and reproduction only |
## Related Resources
- **Paper**: [Activation Oracles (arXiv:2512.15674)](https://arxiv.org/abs/2512.15674)
- **Code**: [activation_oracles](https://github.com/adamkarvonen/activation_oracles)