---
license: apache-2.0
language:
- en
tags:
- video
- genmo
- diffusers
pipeline_tag: text-to-video
library_name: diffusers
---

# Distilled Mochi Transformer
This repository contains a distilled transformer for Genmo's Mochi 1 (`genmo/mochi-1-preview`).
The distilled transformer has 42 blocks, compared with 48 blocks in the original transformer.

### Training details
We analyzed the MSE of the latent after each block and iteratively dropped the block with the minimum MSE.

After each block drop, we trained the neighboring blocks (one before and one after the deleted block) for 1K steps.

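The pruning loop described above can be sketched in plain Python. This is a toy illustration under stated assumptions, not the training code used for this model: the names `block_scores`, `prune_blocks`, `make_scale_block`, and the `on_drop` callback are hypothetical, the blocks are simple functions on lists of floats rather than transformer blocks, and comparing each block's input latent with its output latent is only one plausible reading of "MSE of the latent after each block".

```python
from typing import Callable, List, Optional, Sequence

Latent = List[float]
Block = Callable[[Latent], Latent]


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equally sized latents."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def block_scores(blocks: Sequence[Block], latent: Latent) -> List[float]:
    """Score each block by the MSE between its input and its output.

    ASSUMPTION: the model card does not state which tensors are compared;
    input vs. output of each block is used here for illustration.
    """
    scores = []
    for block in blocks:
        out = block(latent)
        scores.append(mse(latent, out))
        latent = out  # the next block sees this block's output
    return scores


def prune_blocks(
    blocks: Sequence[Block],
    latent: Latent,
    target_depth: int,
    on_drop: Optional[Callable[[int, List[int]], None]] = None,
) -> List[int]:
    """Iteratively drop the lowest-scoring block until target_depth remain.

    Returns the original indices of the kept blocks. After each drop,
    on_drop (a hypothetical hook) receives the dropped index and the
    indices of its surviving neighbors, which is where a short fine-tune
    of those neighbors would run.
    """
    kept = list(range(len(blocks)))
    while len(kept) > target_depth:
        scores = block_scores([blocks[i] for i in kept], latent)
        pos = min(range(len(scores)), key=scores.__getitem__)
        dropped = kept.pop(pos)
        # After the pop, the former neighbors sit at pos - 1 and pos.
        neighbors = [kept[p] for p in (pos - 1, pos) if 0 <= p < len(kept)]
        if on_drop is not None:
            on_drop(dropped, neighbors)
    return kept


def make_scale_block(delta: float) -> Block:
    """Toy block that scales every latent value by (1 + delta)."""
    return lambda z: [v * (1.0 + delta) for v in z]


toy_blocks = [make_scale_block(d) for d in (0.5, 0.01, 0.3, 0.02)]
drop_log = []
kept = prune_blocks(
    toy_blocks,
    [1.0, -2.0, 3.0],
    target_depth=2,
    on_drop=lambda dropped, neighbors: drop_log.append((dropped, neighbors)),
)
print(kept)      # [0, 2]
print(drop_log)  # [(1, [0, 2]), (3, [2])]
```

In the toy run, the two blocks that barely change the latent (indices 1 and 3) are removed first, and the log shows which neighbors would be fine-tuned after each removal. Scores are recomputed after every drop because removing a block changes the inputs seen by all later blocks.
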
### Try it here: [Interactive Demo](https://nim.video/create/2855fa68-21b1-4114-b366-53e5e4705ebf?workflow=image2video)

---

## Usage
#### Minimal code example
```python
import torch
from diffusers import MochiPipeline, MochiTransformer3DModel
from diffusers.utils import export_to_video

# Load the distilled 42-block transformer from this repository.
transformer = MochiTransformer3DModel.from_pretrained(
    "NimVideo/mochi-1-transformer-42",
    torch_dtype=torch.bfloat16,
)
# Load the remaining components from the original Mochi 1 pipeline,
# substituting the distilled transformer.
pipe = MochiPipeline.from_pretrained(
    "genmo/mochi-1-preview",
    transformer=transformer,
    variant="bf16",
    torch_dtype=torch.bfloat16,
)

# Reduce peak GPU memory usage.
pipe.enable_model_cpu_offload()
pipe.enable_vae_tiling()

prompt = "Close-up of a chameleon's eye, with its scaly skin changing color. Ultra high resolution 4k."
frames = pipe(prompt, num_frames=85).frames[0]

export_to_video(frames, "mochi.mp4", fps=30)
```

## Acknowledgements
Original code and models: [mochi](https://github.com/genmoai/mochi).

## Contacts
<p>Issues should be raised directly in the repository.</p>