SAEdit: Token-level control for continuous image editing via Sparse AutoEncoder
arXiv: 2510.05081
Repo: Ronenk94/T5_matryoshka_sae
Model Type: Sparse Autoencoder over T5 Embeddings
Paper: SAEdit: Token-level control for continuous image editing via Sparse AutoEncoder
License: CC BY 4.0
This repository contains the Top-K (K = 300) Sparse Autoencoder (SAE) used in the SAEdit framework.
It is trained on T5 text embeddings and designed to produce sparse latent representations that enable token-level semantic control in image editing pipelines.
| Property | Details |
|---|---|
| Architecture | GlobalBatchTopKMatryoshkaSAE |
| Latent sparsity | Top-K = 300 activations |
| Backbone embeddings | Frozen T5 encoder |
| Task | Semantic factorization + reconstruction |
| Use case | Editing directions for diffusion-based image manipulation |
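For intuition, the sketch below illustrates plain per-sample Top-K sparsification. It is a simplified view only: the actual `GlobalBatchTopKMatryoshkaSAE` allocates the Top-K budget globally across a batch and uses nested (Matryoshka) dictionary prefixes, and the 16,384-unit dictionary size here is an arbitrary placeholder.

```python
import torch

def topk_sparsify(latents: torch.Tensor, k: int = 300) -> torch.Tensor:
    """Keep the k largest activations per row and zero out everything else."""
    values, indices = latents.topk(k, dim=-1)
    sparse = torch.zeros_like(latents)
    sparse.scatter_(-1, indices, values)
    return sparse

# Toy example: 4 dense latent vectors over a 16,384-unit dictionary (placeholder size)
dense = torch.relu(torch.randn(4, 16384))
sparse = topk_sparsify(dense, k=300)
print((sparse != 0).sum(dim=-1))  # at most 300 non-zero units per row
```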
To load the SAE:

```python
import torch
from src.models.sparse_autoencoders.matryoshka_sae import GlobalBatchTopKMatryoshkaSAE

# Option A: using a from_pretrained method (if implemented)
model = GlobalBatchTopKMatryoshkaSAE.from_pretrained(
    "Ronenk94/T5_matryoshka_sae",
    device="cuda",
)
```
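Once loaded, the SAE consumes T5 token embeddings and returns a sparse code that can be decoded back into embedding space. The sketch below is hypothetical: the `encode`/`decode` method names and the placeholder tensor shapes are assumptions, not the repository's verified API.

```python
# Hypothetical usage sketch; `encode`/`decode` names and tensor shapes are assumptions.
t5_embeddings = torch.randn(1, 77, 4096, device="cuda")  # stand-in for real T5 encoder outputs
with torch.no_grad():
    latents = model.encode(t5_embeddings)    # sparse code, at most 300 active units per token
    reconstruction = model.decode(latents)   # projected back to the T5 embedding space
```

In the SAEdit pipeline, editing directions are built in this sparse latent space and applied to individual token embeddings before they condition the diffusion model; see the paper for the exact procedure.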
If you use this model for your research, please cite the following work:
```bibtex
@misc{kamenetsky2025saedittokenlevelcontrolcontinuous,
  title={SAEdit: Token-level control for continuous image editing via Sparse AutoEncoder},
  author={Ronen Kamenetsky and Sara Dorfman and Daniel Garibi and Roni Paiss and Or Patashnik and Daniel Cohen-Or},
  year={2025},
  eprint={2510.05081},
  archivePrefix={arXiv},
  primaryClass={cs.GR},
  url={https://arxiv.org/abs/2510.05081},
}
```
The SAE was trained on the T5 encoder shipped with the Flux-Dev variant; other diffusion models may use a different T5 checkpoint, so embeddings from other encoders may not match the training distribution.
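A minimal sketch of obtaining the matching T5 embeddings, assuming the standard diffusers layout of the `black-forest-labs/FLUX.1-dev` repository (T5-XXL encoder under `text_encoder_2`, its tokenizer under `tokenizer_2`); the repository id, subfolder names, and gated access are assumptions to verify.

```python
import torch
from transformers import AutoTokenizer, T5EncoderModel

# Assumed repository id and subfolder names (standard diffusers layout for FLUX.1-dev);
# the checkpoint may be gated and require accepting its license on the Hub.
repo = "black-forest-labs/FLUX.1-dev"
tokenizer = AutoTokenizer.from_pretrained(repo, subfolder="tokenizer_2")
t5 = T5EncoderModel.from_pretrained(
    repo, subfolder="text_encoder_2", torch_dtype=torch.bfloat16
).to("cuda").eval()

with torch.no_grad():
    tokens = tokenizer("a photo of a smiling person", return_tensors="pt").to("cuda")
    t5_embeddings = t5(tokens.input_ids).last_hidden_state  # token-level embeddings fed to the SAE
```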
This model is released under Creative Commons Attribution 4.0 (CC BY 4.0), consistent with the associated SAEdit paper.