---
base_model: Qwen/Qwen3.6-27B
library_name: peft
pipeline_tag: text-generation
license: mit
tags:
- activation-oracles
- interpretability
- lora
- peft
- self-introspection
---

# Activation Oracle for Qwen3.6-27B

This is a PEFT LoRA adapter for `Qwen/Qwen3.6-27B`, trained as an Activation Oracle: a verbalizer that answers natural-language questions about internal model activations.

The adapter is intended for use with the Activation Oracles codebase and demo workflow, where target-model activations are injected into the verbalizer via activation steering hooks.
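
For intuition only, here is a minimal sketch of what such a steering hook can look like in PyTorch, assuming `layer` is a decoder layer of the verbalizer and `activation` is a captured target-model vector; the function name and token position are illustrative, and the repository's actual implementation may differ:

```python
import torch

def make_injection_hook(activation: torch.Tensor, position: int):
    # Forward pre-hook: overwrite the residual stream at one token position
    # with the target-model activation before the layer processes it.
    def hook(module, args, kwargs):
        hidden_states = args[0].clone()
        hidden_states[:, position, :] = activation.to(
            hidden_states.device, hidden_states.dtype
        )
        return (hidden_states,) + args[1:], kwargs
    return hook

# Inject at the last token position, then remove the hook after generation.
handle = layer.register_forward_pre_hook(
    make_injection_hook(activation, position=-1), with_kwargs=True
)
# ... run the question through the verbalizer ...
handle.remove()
```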

## Details

- Base model: `Qwen/Qwen3.6-27B`
- Adapter type: LoRA
- LoRA rank: 64
- LoRA alpha: 128
- LoRA dropout: 0.05
- Training mixture: LatentQA, binary classification tasks, and Past Lens/self-supervised context prediction
- Activation layers: 25%, 50%, and 75% depth of the target model
- Hook layer: 1
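
For reference, the hyperparameters above correspond to a PEFT configuration along the following lines; this is a reconstruction for illustration, and the target modules are an assumption, since the card does not list them:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=64,              # LoRA rank
    lora_alpha=128,    # effective scaling of alpha / r = 2.0
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    # target_modules is not stated by this card; attention projections such
    # as ["q_proj", "k_proj", "v_proj", "o_proj"] are a common choice.
)
```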

## Usage

See the project repository for end-to-end inference code:

- GitHub: https://github.com/federicotorrielli/activation_oracles_qwen36

Basic adapter loading:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model (device_map="auto" requires `accelerate`).
base_model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3.6-27B", torch_dtype="auto", device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3.6-27B")

# Attach the LoRA adapter on top of the base model.
model = PeftModel.from_pretrained(base_model, "EvilScript/activation-oracle-qwen3.6-27B")
```
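
Once loaded, the adapted model behaves like an ordinary causal LM; a quick sanity check (the prompt is illustrative):

```python
inputs = tokenizer("Hello, world!", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```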
Loading the adapter alone does not perform activation-oracle inference; activation collection and the steering-hook injection path are implemented in the repository.
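
For intuition only, a minimal sketch of collecting target-model activations at the listed fractional depths, assuming a standard Hugging Face causal LM and using the base model as the target for illustration; the repository's actual pipeline may differ:

```python
import torch

# Example input; any text works for this illustration.
text = "The quick brown fox jumps over the lazy dog."
inputs = tokenizer(text, return_tensors="pt").to(base_model.device)

# Map 25%/50%/75% fractional depths to concrete layer indices
# (the rounding convention is an assumption).
num_layers = base_model.config.num_hidden_layers
layer_ids = [int(num_layers * f) for f in (0.25, 0.50, 0.75)]

with torch.no_grad():
    out = base_model(**inputs, output_hidden_states=True)

# hidden_states[0] is the embedding output, so layer i lives at index i + 1.
# Here we keep the last-token residual-stream vector per layer.
activations = {i: out.hidden_states[i + 1][:, -1, :] for i in layer_ids}
```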