---
library_name: saelens
---
SAE panels and SAEBench results from the paper "[Are Sparse Autoencoder Benchmarks Reliable?](https://arxiv.org/abs/2605.18229)"

This repo is split into 2 panels: a cross-architecture panel consisting of 4 SAEs (k=50 Matryoshka, k=100 Matryoshka, k=50 BatchTopK, k=100 BatchTopK),
and a Matryoshka panel consisting of 4 Matryoshka SAEs varying the number of Matryoshka prefixes from 1 to 4 (n-1, n-2, n-3, n-4). Each SAE in the Matryoshka panel
is trained 3 times with different seeds (so 12 SAEs total). The cross-architecture panel is trained for 1.5B tokens, while the Matryoshka panel is trained for 300M tokens.

Within each SAE dir, there are a number of snapshots of the SAE taken throughout training. Each of these snapshot dirs includes the following:

- SAE weights (`sae_weights.safetensors`) and `cfg.json` for loading with SAELens
- SAEBench raw result JSON files for all SAEBench metrics
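
As a rough sketch of how the snapshot dirs could be discovered programmatically, the helper below groups a flat file listing by directory and keeps the directories that hold both `sae_weights.safetensors` and `cfg.json`. The function names `find_snapshot_dirs` and `list_remote_snapshot_dirs` are hypothetical helpers, not part of SAELens, and the remote variant assumes `huggingface_hub` is installed and the network is reachable.

```python
from pathlib import PurePosixPath

# File names taken from the list above.
WEIGHTS_FILE = "sae_weights.safetensors"
CONFIG_FILE = "cfg.json"


def find_snapshot_dirs(repo_files):
    """Return the sorted dirs that contain both the SAE weights and cfg.json."""
    names_by_dir = {}
    for file_path in repo_files:
        path = PurePosixPath(file_path)
        names_by_dir.setdefault(str(path.parent), set()).add(path.name)
    required = {WEIGHTS_FILE, CONFIG_FILE}
    return sorted(d for d, names in names_by_dir.items() if required <= names)


def list_remote_snapshot_dirs(repo_id="decoderesearch/sae-snapshot-panels"):
    """Hypothetical convenience wrapper: list snapshot dirs on the Hub.

    Requires network access and `pip install huggingface_hub`.
    """
    from huggingface_hub import list_repo_files

    return find_snapshot_dirs(list_repo_files(repo_id))
```

Any directory returned this way should be usable as the second argument to `SAE.from_pretrained`, assuming every snapshot follows the layout described above.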

To load an SAE snapshot using SAELens, run the following:

```python
from sae_lens import SAE
sae = SAE.from_pretrained("decoderesearch/sae-snapshot-panels", "path/to/snapshot")
```

For instance, to load the SAE snapshot for the k=100 BatchTopK SAE after 500M tokens of training, you would run:

```python
from sae_lens import SAE

# The second argument is the snapshot dir path inside the repo
sae = SAE.from_pretrained(
    "decoderesearch/sae-snapshot-panels",
    "cross-arch-panel/gemma-2-2b/batchtopk/k-100/seed-0/snapshots/step-122070-tokens-500000000",
)
```
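
The snapshot dir name in the example above appears to encode the training step and token count as `step-<N>-tokens-<M>`. A minimal sketch for parsing that name and picking the snapshot nearest a token budget might look like the following. The naming pattern is inferred from that single example path, so other snapshots may differ, and `parse_snapshot_path` and `closest_snapshot` are hypothetical helpers rather than SAELens functions.

```python
import re
from typing import NamedTuple

# Assumed naming pattern, inferred from the one example path in this README.
SNAPSHOT_RE = re.compile(r"step-(\d+)-tokens-(\d+)$")


class SnapshotInfo(NamedTuple):
    path: str
    step: int
    tokens: int


def parse_snapshot_path(path):
    """Extract the step and token count from a snapshot dir path."""
    match = SNAPSHOT_RE.search(path.rstrip("/"))
    if match is None:
        raise ValueError(f"not a recognised snapshot path: {path!r}")
    return SnapshotInfo(path, int(match.group(1)), int(match.group(2)))


def closest_snapshot(paths, target_tokens):
    """Return the snapshot whose token count is nearest to target_tokens."""
    infos = [parse_snapshot_path(p) for p in paths]
    return min(infos, key=lambda info: abs(info.tokens - target_tokens))
```

The `path` field of the result can then be passed as the second argument to `SAE.from_pretrained`.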

## Citation

If you use these SAEs in your work, please cite the following:

```bibtex
@misc{chanin2026saebenchmarks,
  title={Are Sparse Autoencoder Benchmarks Reliable?},
  author={David Chanin},
  year={2026},
  eprint={2605.18229},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2605.18229},
}
```