# MotionClone

This repository is the official implementation of [MotionClone](https://arxiv.org/abs/2406.05338). It is a **training-free framework** that enables motion cloning from a reference video for controllable video generation, **without cumbersome video inversion processes**.

<details><summary>Click for the full abstract of MotionClone</summary>

> Motion-based controllable video generation offers the potential for creating captivating visual content. Existing methods typically necessitate model training to encode particular motion cues or incorporate fine-tuning to inject certain motion patterns, resulting in limited flexibility and generalization.
> In this work, we propose **MotionClone**, a training-free framework that enables motion cloning from reference videos for versatile motion-controlled video generation, including text-to-video and image-to-video. Based on the observation that the dominant components in temporal-attention maps drive motion synthesis, while the rest mainly capture noisy or very subtle motions, MotionClone utilizes sparse temporal attention weights as motion representations for motion guidance, facilitating diverse motion transfer across varying scenarios. Meanwhile, MotionClone allows for the direct extraction of motion representations through a single denoising step, bypassing cumbersome inversion processes and thus promoting both efficiency and flexibility.
> Extensive experiments demonstrate that MotionClone exhibits proficiency in both global camera motion and local object motion, with notable superiority in terms of motion fidelity, textual alignment, and temporal consistency.

</details>

**[MotionClone: Training-Free Motion Cloning for Controllable Video Generation](https://arxiv.org/abs/2406.05338)**
</br>
[Pengyang Ling*](https://github.com/LPengYang/),
[Jiazi Bu*](https://github.com/Bujiazi/),
[Pan Zhang<sup>†</sup>](https://panzhang0212.github.io/),
[Xiaoyi Dong](https://scholar.google.com/citations?user=FscToE0AAAAJ&hl=en/),
[Yuhang Zang](https://yuhangzang.github.io/),
[Tong Wu](https://wutong16.github.io/),
[Huaian Chen](https://scholar.google.com.hk/citations?hl=zh-CN&user=D6ol9XkAAAAJ),
[Jiaqi Wang](https://myownskyw7.github.io/),
[Yi Jin<sup>†</sup>](https://scholar.google.ca/citations?hl=en&user=mAJ1dCYAAAAJ)

(*Equal Contribution) (<sup>†</sup>Corresponding Author)

<!-- [Arxiv Report](https://arxiv.org/abs/2307.04725) | [Project Page](https://animatediff.github.io/) -->
[arXiv](https://arxiv.org/abs/2406.05338)
[Project Page](https://bujiazi.github.io/motionclone.github.io/)

## Demo

https://github.com/user-attachments/assets/d1f1c753-f192-455b-9779-94c925e51aaa

## 🛠️ Quick Setup

The following commands set up the environment and download all required model weights in one pass:

```bash
# System dependencies: Git LFS for model downloads, ffmpeg for video I/O
sudo apt-get update && sudo apt-get install git-lfs ffmpeg cbm

# Create and register a Python 3.10 environment
conda create --name py310 python=3.10
conda activate py310
pip install ipykernel
python -m ipykernel install --user --name py310 --display-name "py310"

# Clone the repository and install Python dependencies
git clone https://github.com/svjack/MotionClone && cd MotionClone
pip install -r requirements.txt

# Stable Diffusion v1.5 base model
mkdir -p models
git clone https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5 models/StableDiffusion/

# Community checkpoint: Realistic Vision V6.0 B1
mkdir -p models/DreamBooth_LoRA
wget https://huggingface.co/svjack/Realistic-Vision-V6.0-B1/resolve/main/realisticVisionV60B1_v51VAE.safetensors -O models/DreamBooth_LoRA/realisticVisionV60B1_v51VAE.safetensors

# AnimateDiff motion module and adapter
mkdir -p models/Motion_Module
wget https://huggingface.co/guoyww/animatediff/resolve/main/v3_sd15_mm.ckpt -O models/Motion_Module/v3_sd15_mm.ckpt
wget https://huggingface.co/guoyww/animatediff/resolve/main/v3_sd15_adapter.ckpt -O models/Motion_Module/v3_sd15_adapter.ckpt

# SparseCtrl checkpoints for image-to-video and sketch-to-video
mkdir -p models/SparseCtrl
wget https://huggingface.co/guoyww/animatediff/resolve/main/v3_sd15_sparsectrl_rgb.ckpt -O models/SparseCtrl/v3_sd15_sparsectrl_rgb.ckpt
wget https://huggingface.co/guoyww/animatediff/resolve/main/v3_sd15_sparsectrl_scribble.ckpt -O models/SparseCtrl/v3_sd15_sparsectrl_scribble.ckpt
```
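
After these downloads finish, every file the inference configs reference should be in place. Here is a quick, hypothetical layout check (not part of the repository; the `expected` list simply mirrors the paths above):

```python
# Hypothetical layout check (not part of the repo): confirm every model
# file from the setup commands above landed where the configs expect it.
import os

expected = [
    "models/StableDiffusion/model_index.json",  # marker file of the SD 1.5 clone
    "models/DreamBooth_LoRA/realisticVisionV60B1_v51VAE.safetensors",
    "models/Motion_Module/v3_sd15_mm.ckpt",
    "models/Motion_Module/v3_sd15_adapter.ckpt",
    "models/SparseCtrl/v3_sd15_sparsectrl_rgb.ckpt",
    "models/SparseCtrl/v3_sd15_sparsectrl_scribble.ckpt",
]
missing = [p for p in expected if not os.path.exists(p)]
print("all model files present" if not missing else f"missing: {missing}")
```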

## 🖋 News

- The latest version of our paper (**v4**) is available on arXiv! (10.08)
- The **v3** version of our paper is available on arXiv! (7.2)
- Code released! (6.29)

## 🏗️ Todo

- [x] We have updated the latest version of MotionClone, which performs motion transfer **without video inversion** and supports **image-to-video and sketch-to-video**.
- [x] Release the MotionClone code. (We have released **the first version** of our code and will continue to optimize it. We welcome any questions or issues and will address them promptly.)
- [x] Release the paper.

## 📚 Gallery

We show more results on the [Project Page](https://bujiazi.github.io/motionclone.github.io/).

## 🚀 Method Overview

### Feature visualization
<div align="center">
    <img src='__assets__/feature_visualization.png'/>
</div>

### Pipeline
<div align="center">
    <img src='__assets__/pipeline.png'/>
</div>

MotionClone utilizes sparse temporal attention weights as motion representations for motion guidance, facilitating diverse motion transfer across varying scenarios. Meanwhile, MotionClone allows for the direct extraction of motion representations through a single denoising step, bypassing cumbersome inversion processes and thus promoting both efficiency and flexibility.
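
To make the mechanism concrete, below is a minimal conceptual sketch of sparse temporal-attention guidance in PyTorch. It is not the repository's actual implementation; the function names and the `top_rho` parameter are illustrative only.

```python
# Conceptual sketch (illustrative, not the repo's code): keep only the
# dominant entries of a temporal attention map as the motion representation,
# then penalize deviation from them during denoising.
import torch
import torch.nn.functional as F

def sparse_motion_mask(attn: torch.Tensor, top_rho: float = 0.1) -> torch.Tensor:
    """attn: temporal attention map, shape [batch*h*w, heads, frames, frames].
    Returns a 0/1 mask selecting the top-rho fraction of entries per query."""
    k = max(1, int(top_rho * attn.shape[-1]))
    kth_value = attn.topk(k, dim=-1).values[..., -1:]  # k-th largest per row
    return (attn >= kth_value).float()

def motion_guidance_loss(ref_attn, gen_attn, top_rho=0.1):
    """MSE between reference and generated temporal attention, restricted
    to the sparse dominant components of the reference map."""
    mask = sparse_motion_mask(ref_attn, top_rho)
    return F.mse_loss(gen_attn * mask, ref_attn * mask)
```

In this picture, `ref_attn` would come from a single denoising step on the noised reference video, which is what lets MotionClone skip per-video inversion; the gradient of the loss then steers generation toward the reference motion.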

## 🔧 Installation (python==3.11.3 recommended)

### Set up the repository and conda environment

```bash
git clone https://github.com/Bujiazi/MotionClone.git
cd MotionClone

conda env create -f environment.yaml
conda activate motionclone
```

## 🔑 Pretrained Model Preparation

### Download Stable Diffusion V1.5

```bash
git lfs install
git clone https://huggingface.co/stable-diffusion-v1-5/stable-diffusion-v1-5 models/StableDiffusion/
```

This clones the Stable Diffusion weights directly into `models/StableDiffusion` (the original `runwayml/stable-diffusion-v1-5` repository has been relocated to the `stable-diffusion-v1-5` organization on Hugging Face).
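
Before running inference, it can be worth confirming the local copy loads. Here is a hypothetical sanity check (not part of the repository), assuming the `diffusers` package from the requirements is installed:

```python
# Hypothetical load test (not part of the repo): open the local SD 1.5
# copy with diffusers to confirm the clone is complete.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "models/StableDiffusion",        # local path created by the clone above
    torch_dtype=torch.float16,
    safety_checker=None,             # skip the safety checker for a quick test
)
print(pipe.unet.config.sample_size)  # 64 for SD 1.5 (64 latents * 8 = 512 px)
```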

### Prepare Community Models

Manually download the community `.safetensors` model from [RealisticVision V5.1](https://civitai.com/models/4201?modelVersionId=130072) and save it to `models/DreamBooth_LoRA`.

### Prepare AnimateDiff Motion Modules

Manually download the AnimateDiff modules from [AnimateDiff](https://github.com/guoyww/AnimateDiff); we recommend [`v3_sd15_adapter.ckpt`](https://huggingface.co/guoyww/animatediff/blob/main/v3_sd15_adapter.ckpt) and [`v3_sd15_mm.ckpt`](https://huggingface.co/guoyww/animatediff/blob/main/v3_sd15_mm.ckpt). Save the modules to `models/Motion_Module`.

### Prepare SparseCtrl for image-to-video and sketch-to-video

Manually download `v3_sd15_sparsectrl_rgb.ckpt` and `v3_sd15_sparsectrl_scribble.ckpt` from [AnimateDiff](https://huggingface.co/guoyww/animatediff/tree/main). Save the modules to `models/SparseCtrl`.
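
If a download is interrupted, a truncated `.ckpt` only fails at load time deep inside the pipeline. Here is a hypothetical up-front integrity check (not part of the repository) that each checkpoint deserializes:

```python
# Hypothetical integrity check (not part of the repo): load every AnimateDiff
# checkpoint on CPU and report its tensor count.
from pathlib import Path
import torch

for ckpt in sorted(Path("models").rglob("*.ckpt")):
    state = torch.load(ckpt, map_location="cpu")
    state = state.get("state_dict", state)  # some checkpoints nest the weights
    print(f"{ckpt}: {len(state)} entries")
```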

## 🎈 Quick Start

### Text-to-video generation with customized camera motion

```bash
python t2v_video_sample.py --inference_config "configs/t2v_camera.yaml" --examples "configs/t2v_camera.jsonl"
```

https://github.com/user-attachments/assets/2656a49a-c57d-4f89-bc65-5ec09ac037ea

### Text-to-video generation with customized object motion

```bash
python t2v_video_sample.py --inference_config "configs/t2v_object.yaml" --examples "configs/t2v_object.jsonl"
```

### Combine motion cloning with sketch-to-video

```bash
python i2v_video_sample.py --inference_config "configs/i2v_sketch.yaml" --examples "configs/i2v_sketch.jsonl"
```

### Combine motion cloning with image-to-video

```bash
python i2v_video_sample.py --inference_config "configs/i2v_rgb.yaml" --examples "configs/i2v_rgb.jsonl"
```

## 📎 Citation

If you find this work helpful, please cite the following paper:

```bibtex
@article{ling2024motionclone,
  title={MotionClone: Training-Free Motion Cloning for Controllable Video Generation},
  author={Ling, Pengyang and Bu, Jiazi and Zhang, Pan and Dong, Xiaoyi and Zang, Yuhang and Wu, Tong and Chen, Huaian and Wang, Jiaqi and Jin, Yi},
  journal={arXiv preprint arXiv:2406.05338},
  year={2024}
}
```

## 📣 Disclaimer

This is the official code of MotionClone.
The copyrights of the demo images and audio belong to community users.
Feel free to contact us if you would like them removed.

## 💞 Acknowledgements

The code is built upon the repositories below; we thank all the contributors for open-sourcing:
* [AnimateDiff](https://github.com/guoyww/AnimateDiff)
* [FreeControl](https://github.com/genforce/freecontrol)