---
license: mit
library_name: pytorch
tags:
  - sparse-autoencoder
  - interpretability
  - mechanistic-interpretability
  - gated-deltanet
  - mamba
  - rwkv
  - linear-attention
  - state-space-model
base_model:
  - Qwen/Qwen3.5-0.8B
  - Qwen/Qwen3.5-4B
  - Qwen/Qwen3.5-27B
language:
  - en
pipeline_tag: feature-extraction
---

# WriteSAE

**WriteSAE: Sparse Autoencoders for Recurrent State**

Jack Young

[Paper](https://arxiv.org/abs/2605.12770) | [Website](https://www.jackyoung.io/research/writesae) | [Code](https://github.com/JackYoung27/writesae)

WriteSAE factors each decoder atom as a rank-1 outer product **vᵢwᵢᵀ**, matching the native **kₜvₜᵀ** write that Gated DeltaNet, Mamba-2, and RWKV-7 install into their **dₖ × dᵥ** matrix cache. Residual-stream SAEs cannot reach that write site; WriteSAE can.

At Qwen3.5-0.8B L9 H4, atom substitution beats matched-Frobenius-norm ablation on **92.4%** of *n* = 4,851 firings, and the closed form predicts measured logit shifts at **R² = 0.98**. Sustained three-position installs lift midrank target-in-continuation from 33.3% to **100%** under greedy decoding. Cross-architecture: GDN rank-1 atoms transfer to Mamba-2-370M at 88.1% over 2,500 firings, with sharpness ordering GDN > RWKV-7 > Mamba-2.

## Quick start

```python
from huggingface_hub import snapshot_download
import torch

ckpt_dir = snapshot_download(
    "JackYoung27/writesae-ckpts",
    allow_patterns=["writesae/qwen0p8b/L9_H4/*"],
)
ckpt = torch.load(
    f"{ckpt_dir}/writesae/qwen0p8b/L9_H4/best.pt",
    weights_only=False,
    map_location="cpu",
)

# Decoder atom 412: the paper's ERASE example.
v_412 = ckpt["sae"].decoder.v[412]  # (d_k,)
w_412 = ckpt["sae"].decoder.w[412]  # (d_v,)
atom = torch.outer(v_412, w_412)    # (d_k, d_v)
```

A standalone runnable version is in [`LOAD_EXAMPLE.py`](LOAD_EXAMPLE.py). For a sketch of using an atom for substitution, see the end of this card.

## Variants

| variant | encoder | decoder | role |
|---|---|---|---|
| **WriteSAE** | bilinear vᵢᵀ S wᵢ | rank-1 vᵢwᵢᵀ | all headline numbers |
| FlatSAE | linear on vec(S) | flat | architectural-prior comparison |
| MatrixSAE | linear on vec(S) | full-rank | ablation |
| BilinearSAE | bilinear | bilinear | ablation |

## Base models covered

Qwen3.5-0.8B (primary), Qwen3.5-4B, Qwen3.5-27B, Mamba-2-370M, RWKV-7-1.5B, DeltaNet-1.3B, GLA-1.3B. See [`MODEL_CARD.md`](MODEL_CARD.md) for full layer/head coverage and training details.

## Repository layout

```text
writesae-ckpts/
  README.md
  MODEL_CARD.md
  manifest.json
  LOAD_EXAMPLE.py
  LICENSE
  writesae/<model>/<layer>_<head>/best.pt       # primary cells
  flat_baseline/<model>_<layer>_<head>/best.pt  # FlatSAE controls
  results/<claim>/                              # JSON outputs per paper claim
```

## Limitations

The closed-form factorization predicts well only on Gated DeltaNet (R² = 0.98 at L9 H4); applied to Mamba-2 or Qwen3.5-4B, it returns negative R². The substitution test itself transfers to Mamba-2 (88.1%); the analytical coefficient does not. Per-atom identity varies across SAE seeds; the class-level register/bundle partition reproduces at a coefficient of variation of 4–12%.

## Citation

```bibtex
@article{young2026writesae,
  title   = {WriteSAE: Sparse Autoencoders for Recurrent State},
  author  = {Young, Jack},
  year    = {2026},
  journal = {arXiv preprint arXiv:2605.12770},
  url     = {https://github.com/JackYoung27/writesae}
}
```

MIT license. Base models retain their upstream licenses; no base-model weights are redistributed.
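## Atom substitution sketch

A minimal sketch of the substitution edit described above, continuing from the Quick start. It assumes tied encoder/decoder factors (the bilinear encoder reads aᵢ = vᵢᵀ S wᵢ, per the variants table) and rescales the install so the readout hits a chosen target; whether the released pipeline normalizes atoms this way is an assumption, and `S`, `a_new`, and the stand-in dimensions below are hypothetical. The paper's actual substitution outputs live under `results/`.

```python
import torch

torch.manual_seed(0)

# Stand-ins so the sketch runs on its own; in practice v_412 / w_412
# come from the checkpoint (see Quick start) and S is a d_k x d_v state
# matrix captured from a Gated DeltaNet head mid-sequence.
d_k, d_v = 128, 128        # hypothetical head dimensions
v_412 = torch.randn(d_k)   # ckpt["sae"].decoder.v[412] in practice
w_412 = torch.randn(d_v)   # ckpt["sae"].decoder.w[412] in practice
S = torch.randn(d_k, d_v)  # captured recurrent state (placeholder)

# Bilinear encoder readout for atom 412: a = v^T S w.
a_412 = v_412 @ S @ w_412

# Substitution: shift the atom's readout from a to a_new by installing a
# scaled copy of the rank-1 atom, S' = S + c * v w^T, with
# c = (a_new - a) / ((v.v)(w.w)), so that v^T S' w = a_new exactly.
a_new = 2.0 * a_412  # hypothetical target activation
c = (a_new - a_412) / ((v_412 @ v_412) * (w_412 @ w_412))
S_sub = S + c * torch.outer(v_412, w_412)

# Sanity check: the edited state reads out at the target activation.
assert torch.allclose(v_412 @ S_sub @ w_412, a_new, rtol=1e-3, atol=1e-3)
```

Writing `S_sub` back into the running cache, and sustaining it across positions as in the three-position installs, is model-specific; see the [paper](https://arxiv.org/abs/2605.12770) and [code](https://github.com/JackYoung27/writesae) for the instrumented forward passes.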