
AutoencoderKLCogVideoX

The 3D variational autoencoder (VAE) model with KL loss used in CogVideoX was introduced in CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer by Tsinghua University & ZhipuAI.

The model can be loaded with the following code snippet.

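import torch
from diffusers import AutoencoderKLCogVideoX

# Load only the VAE weights from the "vae" subfolder of the checkpoint, in half precision, onto the GPU.
vae = AutoencoderKLCogVideoX.from_pretrained("THUDM/CogVideoX-2b", subfolder="vae", torch_dtype=torch.float16).to("cuda")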
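
Because the VAE is 3D, it compresses a video along time as well as along height and width. The sketch below estimates the latent shape for a given input. The helper name `cogvideox_latent_shape` is hypothetical, and its defaults (16 latent channels, 4x temporal and 8x spatial compression, first frame encoded on its own) are assumptions based on the released CogVideoX checkpoints, not values read from this page. Check them against the loaded model's `vae.config` before relying on them.

```python
def cogvideox_latent_shape(batch, frames, height, width,
                           latent_channels=16, temporal_ratio=4, spatial_ratio=8):
    """Estimate the latent shape for a (batch, channels, frames, height, width) video.

    All default ratios are assumptions about the CogVideoX VAE, not documented facts.
    """
    # This sketch only covers frame counts of the form k * temporal_ratio + 1,
    # where the first frame is kept separately and the rest are grouped.
    if frames < 1 or (frames - 1) % temporal_ratio != 0:
        raise ValueError("frames must equal k * temporal_ratio + 1")
    if height % spatial_ratio != 0 or width % spatial_ratio != 0:
        raise ValueError("height and width must be divisible by spatial_ratio")

    latent_frames = (frames - 1) // temporal_ratio + 1
    return (batch, latent_channels, latent_frames,
            height // spatial_ratio, width // spatial_ratio)


print(cogvideox_latent_shape(1, 49, 480, 720))  # (1, 16, 13, 60, 90)
```

With a loaded model, `vae.encode(video).latent_dist.sample()` returns the latents and `vae.decode(latents).sample` returns the reconstructed video, so the estimate can be compared against the real `latents.shape`.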

AutoencoderKLCogVideoX

[[autodoc]] AutoencoderKLCogVideoX
    - decode
    - encode
    - all

AutoencoderKLOutput

[[autodoc]] models.autoencoders.autoencoder_kl.AutoencoderKLOutput

DecoderOutput

[[autodoc]] models.autoencoders.vae.DecoderOutput