SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders

Paper on arXiv | GitHub repo

This repository contains the Sparse Autoencoders (SAEs) trained in our work for the up.1.1 and up.1.2 blocks.

After cloning our GitHub repo, you can use them as follows:

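# Run from the root of the cloned GitHub repo so that the SAE package is importable.
from SAE.sae import Sae

device = "cuda"
hookpoint = "unet.up_blocks.1.attentions.1"  # UNet module whose SAE is loaded

# Load the SAE trained for this hookpoint from the Hugging Face Hub.
sae = Sae.load_from_hub("bcywinski/SAeUron", hookpoint=hookpoint, device=device)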
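
The snippet above loads a single SAE. To build the hookpoint string for either released block, a small helper along these lines may be convenient. This is only a sketch: the function `block_to_hookpoint` is hypothetical and not part of the repo, and it assumes that a short name `up.<i>.<j>` refers to attention module `<j>` of up block `<i>`, following the pattern of the hookpoint shown above.

```python
import re


def block_to_hookpoint(block: str) -> str:
    """Map a short block name such as "up.1.1" to a UNet hookpoint string.

    Assumption (not confirmed by the repo): "up.<i>.<j>" means attention
    module <j> of up block <i>, as in "unet.up_blocks.1.attentions.1".
    """
    match = re.fullmatch(r"up\.(\d+)\.(\d+)", block)
    if match is None:
        raise ValueError(f"Unrecognized block name: {block!r}")
    up_block, attention = match.groups()
    return f"unet.up_blocks.{up_block}.attentions.{attention}"


# Hookpoint strings for the two blocks mentioned in this README.
hookpoints = [block_to_hookpoint(b) for b in ("up.1.1", "up.1.2")]
```

Each resulting string could then be passed as the `hookpoint` argument of `Sae.load_from_hub`, exactly as in the snippet above.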

📚 BibTeX

@article{cywinski2025saeuron,
  title={SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders},
  author={Cywi{\'n}ski, Bartosz and Deja, Kamil},
  journal={arXiv preprint arXiv:2501.18052},
  year={2025}
}