Remove paper citation from model card
README.md CHANGED
@@ -9,14 +9,13 @@ tags:
 - lora
 - peft
 - self-introspection
-- arxiv:2512.15674
 ---
 
 # Activation Oracle for Qwen3.6-27B
 
 This is a PEFT LoRA adapter for `Qwen/Qwen3.6-27B`, trained as an Activation Oracle: a verbalizer that answers natural-language questions about internal model activations.
 
-The adapter is intended for use with the [Activation Oracles](https://arxiv.org/
+The adapter is intended for use with the Activation Oracles codebase and demo workflow, where target-model activations are injected into the verbalizer via activation steering hooks.
 
 ## Details
 
@@ -34,7 +33,6 @@ The adapter is intended for use with the [Activation Oracles](https://arxiv.org/
 See the project repository for end-to-end inference code:
 
 - GitHub: https://github.com/federicotorrielli/activation_oracles_qwen36
-- Paper: https://arxiv.org/abs/2512.15674
 
 Basic adapter loading:
 
@@ -48,17 +46,3 @@ model = PeftModel.from_pretrained(base_model, "EvilScript/activation-oracle-qwen
 ```
 
 Loading the adapter alone does not perform activation-oracle inference; the activation collection and steering-hook path is implemented in the repository.
-
-## Citation
-
-```bibtex
-@misc{karvonen2025activationoraclestrainingevaluating,
-      title={Activation Oracles: Training and Evaluating LLMs as General-Purpose Activation Explainers},
-      author={Adam Karvonen and James Chua and Clément Dumas and Kit Fraser-Taliente and Subhash Kantamneni and Julian Minder and Euan Ong and Arnab Sen Sharma and Daniel Wen and Owain Evans and Samuel Marks},
-      year={2025},
-      eprint={2512.15674},
-      archivePrefix={arXiv},
-      primaryClass={cs.CL},
-      url={https://arxiv.org/abs/2512.15674}
-}
-```
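For context on the card's remaining text: "activations are injected into the verbalizer via activation steering hooks" refers to a standard PyTorch pattern. The sketch below is an illustrative toy, not the repository's actual implementation; it uses a stack of `nn.Linear` layers standing in for transformer blocks, and all names in it are assumptions.

```python
# Generic sketch of activation collection + steering via forward hooks.
# Illustrative only; the real codebase hooks a transformer's hidden states.
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for a small stack of transformer blocks.
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 8), nn.Linear(8, 8))

# 1) Collect an activation from one run (here: the output of layer 1).
collected = {}

def collect_hook(module, inputs, output):
    collected["act"] = output.detach().clone()

handle = model[1].register_forward_hook(collect_hook)
_ = model(torch.randn(1, 8))
handle.remove()

# 2) Steer a later run: a forward hook that returns a tensor replaces the
# module's output, so downstream layers see the injected activation.
def steering_hook(module, inputs, output):
    return collected["act"]

handle = model[1].register_forward_hook(steering_hook)
steered = model(torch.randn(1, 8))
handle.remove()

# Downstream of the hooked layer, the steered run reproduces the original
# run exactly, because layer 2 received the same (injected) input.
reference = model[2](collected["act"])
print(torch.allclose(steered, reference))  # True
```

The key design point is that `register_forward_hook` substitutes the module's output when the hook returns a non-`None` value, which is what lets a separately collected activation be spliced into the middle of a forward pass.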