Improved abliteration method

#1
by lunahr - opened

To abliterate models reliably on Kaggle, you can use this notebook: https://www.kaggle.com/code/piotr25691/universal-abliteration-baukit

Works:

  • New models (Gemma 3, completely uncensored)
  • Phi series (only partially uncensored, because Microsoft's censorship is stronger)

Likely works:

  • Llama series
  • Gemma 2 and older
  • Phi 3.5 and older

May work:

  • Mistral series
  • Other models

It will not work with multimodal image/text models.
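
For anyone wondering what abliteration does under the hood, here is a rough sketch of the commonly described idea: take the difference between mean activations on harmful and harmless prompts as a "refusal direction", then project that direction out. This is not the notebook's code. The function names and the toy activation values are made up for illustration, and a real run would use hidden states captured from the model rather than hand-written lists.

```python
import math

def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def refusal_direction(harmful_acts, harmless_acts):
    """Unit vector along the difference of the two mean activations."""
    diff = [a - b for a, b in zip(mean_vector(harmful_acts),
                                  mean_vector(harmless_acts))]
    norm = math.sqrt(sum(x * x for x in diff))
    return [x / norm for x in diff]

def ablate(vec, direction):
    """Remove the component of vec that lies along the unit vector direction."""
    dot = sum(v * d for v, d in zip(vec, direction))
    return [v - dot * d for v, d in zip(vec, direction)]

# Toy, made-up activations (assumption: 3-dimensional hidden states).
harmful = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
harmless = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

r = refusal_direction(harmful, harmless)
cleaned = ablate([5.0, 2.0, 1.0], r)
print(r, cleaned)
```

In practice the same projection is applied to the model's hidden states or weight matrices, so the model can no longer write along that direction.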

Hi, thanks for your great work. Are you considering making a 12B version?

Likely not possible, because the 12B model is multimodal.
