---
license: apache-2.0
tags:
- sparse-autoencoder
- mechanistic-interpretability
- tool-calling
- gemma
- ministral
- qwen
arxiv: 2605.18882
---
# toolcalling-sae
TopK Sparse Autoencoder checkpoints from [To Call or Not to Call: Diagnosing Intrinsic Over-Calling Bias in LLM Agents](https://arxiv.org/abs/2605.18882).
## Checkpoints
| Model | Layer | Dict Size | k | Stage 1 | Stage 2 |
|-------|-------|-----------|---|---------|---------|
| gemma-3-1b-it | L17 | 9 216 | 128 | 50M tokens | 5M tokens |
| gemma-3-4b-it | L29 | 20 480 | 128 | 50M tokens | 5M tokens |
| gemma-4-E2B-it | L30 | 12 288 | 128 | 50M tokens | 5M tokens |
| gemma-4-E4B-it | L30 | 20 480 | 128 | 50M tokens | 5M tokens |
| Ministral-3-3B-Instruct-2512 | L21 | 24 576 | 128 | 50M tokens | 5M tokens |
| Ministral-3-8B-Instruct-2512 | L31 | 32 768 | 128 | 50M tokens | 5M tokens |
| Qwen3.5-4B | L25 | 20 480 | 128 | 50M tokens | 5M tokens |
| Qwen3.5-9B | L25 | 32 768 | 128 | 50M tokens | 5M tokens |

**Stage 1**: Pre-trained on [OpenWebText2](https://openwebtext2.readthedocs.io/).

**Stage 2**: Fine-tuned on tool-calling activations from the [When2Call](https://arxiv.org/abs/2605.18882) benchmark.

All checkpoints use `bfloat16` precision.
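
For readers unfamiliar with the `k` column: in a TopK SAE, only the `k` largest latent pre-activations per token are kept and the rest are set to zero, so every checkpoint above activates at most 128 latents per token. The sketch below illustrates that selection step in plain Python. It is an illustration only, not the `TopKSAE` code from this repo, and the helper name `topk_activation` is hypothetical; some implementations also apply a ReLU around this step.

```python
def topk_activation(pre_acts, k):
    """Keep the k largest entries of pre_acts and zero out the rest.

    Illustrative helper (hypothetical name), not part of sae_model.py.
    """
    if k <= 0:
        return [0.0] * len(pre_acts)
    # Rank indices by descending value; ties go to the lower index.
    ranked = sorted(range(len(pre_acts)), key=lambda i: (-pre_acts[i], i))
    keep = set(ranked[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(pre_acts)]


# Toy example with 5 latents and k=2 (the checkpoints here use k=128).
print(topk_activation([0.1, 2.0, -0.5, 1.5, 0.3], k=2))
# [0.0, 2.0, 0.0, 1.5, 0.0]
```

In practice this is done on tensors with a batched top-k operation; the list version is only meant to make the sparsity constraint concrete.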
## Usage
```python
from huggingface_hub import hf_hub_download
from sae_model import TopKSAE
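# Download one checkpoint file from the Hub (cached locally after the first call).
ckpt_path = hf_hub_download(
    repo_id="SKwra/toolcalling-sae",
    filename="gemma-3-1b-it/stage2/gemma-3-1b-it-L17-d9216-5M-stage2.pt",
)
# Load the SAE weights onto the GPU.
sae = TopKSAE.load(ckpt_path, device="cuda")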
```
`sae_model.py` is included in this repo. Full code at [GitHub](https://github.com/SKURA502/agent-sae).
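
To load a different checkpoint, the `filename` argument has to change. The helper below is a sketch that rebuilds the path from the table columns, assuming every file follows the same pattern as the single example above (`<model>/stage<N>/<model>-L<layer>-d<dict size>-<tokens>-stage<N>.pt`). That pattern is inferred from one filename and is not confirmed for the other checkpoints, so check the repo's file listing before relying on it. The function name `checkpoint_filename` is hypothetical.

```python
def checkpoint_filename(model, layer, dict_size, tokens, stage):
    """Build a checkpoint path inside the repo.

    ASSUMPTION: the naming pattern is extrapolated from the one filename
    shown in the usage example; other files may be named differently.
    """
    stem = f"{model}-L{layer}-d{dict_size}-{tokens}-stage{stage}"
    return f"{model}/stage{stage}/{stem}.pt"


# Reproduces the filename used in the usage example.
print(checkpoint_filename("gemma-3-1b-it", 17, 9216, "5M", 2))
# gemma-3-1b-it/stage2/gemma-3-1b-it-L17-d9216-5M-stage2.pt
```

The returned string can be passed as `filename` to `hf_hub_download`; if the file does not exist under that name, the download will fail, which is a quick way to find out that the assumed pattern does not hold.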