Paper: SynthSAEBench: Evaluating Sparse Autoencoders on Scalable Realistic Synthetic Data (arXiv:2602.14687)
Sample Sparse Autoencoders (SAEs) trained on the SynthSAEBench-16k-v1 model. Training code is at https://github.com/decoderesearch/synth-sae-bench-experiments.
We train 5 different SAE types, each with 5 seeds and with L0 values ranging from 15 to 45. Each SAE has a stats.json file containing its evaluation statistics.
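As a rough sketch of how the per-SAE stats.json files could be gathered for comparison, the Python snippet below walks a local copy of this repository and loads every stats.json it finds. The directory layout, the collect_stats helper, and the placeholder path are assumptions made for illustration. The keys inside each stats.json are not documented here, so the code treats the contents as opaque JSON.

```python
import json
from pathlib import Path


def collect_stats(root):
    """Map each SAE directory (relative to root) to its parsed stats.json.

    Hypothetical helper: it assumes only that every SAE lives in its own
    directory next to a stats.json file, and nothing about the JSON schema.
    """
    root = Path(root)
    stats = {}
    for path in sorted(root.rglob("stats.json")):
        with path.open() as f:
            stats[path.parent.relative_to(root).as_posix()] = json.load(f)
    return stats


if __name__ == "__main__":
    # Placeholder path: point this at wherever the files were downloaded.
    for sae_dir, sae_stats in collect_stats("path/to/local/copy").items():
        print(sae_dir, sae_stats)
```

Keying the result by relative directory keeps the SAE type, seed, and L0 recoverable from the path if the repository encodes them there, without the helper having to guess a naming scheme.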
Check out SAELens to train your own SAEs on SynthSAEBench and make your own custom synthetic data models. Also see the SynthSAEBench paper for more details.