Update README.md
README.md
CHANGED
@@ -28,13 +28,13 @@ This repo contains official PyTorch model definitions, pre-trained weights and in
 * Mar 06, 2025: 🔥 We release the inference code and model weights of HunyuanVideo-I2V. [Download](https://github.com/Tencent/HunyuanVideo-I2V/blob/main/ckpts/README.md).

-### First Frame Consistency Demo
+<!-- ### First Frame Consistency Demo
 | Reference Image | Generated Video |
 |:----------------:|:----------------:|
 | <img src="https://github.com/user-attachments/assets/83e7a097-ffca-40db-9c72-be01d866aa7d" width="80%"> | <video src="https://github.com/user-attachments/assets/f81d2c88-bb1a-43f8-b40f-1ccc20774563" width="100%"> </video> |
 | <img src="https://github.com/user-attachments/assets/c385a11f-60c7-4919-b0f1-bc5e715f673c" width="80%"> | <video src="https://github.com/user-attachments/assets/0c29ede9-0481-4d40-9c67-a4b6267fdc2d" width="100%"> </video> |
 | <img src="https://github.com/user-attachments/assets/5763f5eb-0be5-4b36-866a-5199e31c5802" width="95%"> | <video src="https://github.com/user-attachments/assets/a8da0a1b-ba7d-45a4-a901-5d213ceaf50e" width="100%"> </video> |
-
+-->

 <!-- ### Customizable I2V LoRA Demo

 | I2V Lora Effect | Reference Image | Generated Video |
@@ -65,7 +65,7 @@ This repo contains official PyTorch model definitions, pre-trained weights and in
 - [x] Inference
 - [x] Checkpoints
 - [x] ComfyUI
-- [ ] Lora training scripts
+- [x] Lora training scripts
 - [ ] Multi-gpus Sequence Parallel inference (Faster inference speed on more gpus)
 - [ ] Diffusers
 - [ ] FP8 Quantified weight
@@ -84,6 +84,12 @@ This repo contains official PyTorch model definitions, pre-trained weights and in
 - [Tips for Using Image-to-Video Models](#tips-for-using-image-to-video-models)
 - [Using Command Line](#using-command-line)
 - [More Configurations](#more-configurations)
+- [🎉 Customizable I2V LoRA effects training](#-customizable-i2v-lora-effects-training)
+  - [Requirements](#requirements)
+  - [Environment](#environment)
+  - [Training data construction](#training-data-construction)
+  - [Training](#training)
+  - [Inference](#inference)
 - [🔗 BibTeX](#-bibtex)
 - [Acknowledgements](#acknowledgements)
 ---
@@ -200,10 +206,10 @@ If you want to generate a more **stable** video, you can set `--i2v-stability` a
 cd HunyuanVideo-I2V

 python3 sample_image2video.py \
-    --prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
-    --i2v-image-path ./demo/imgs/0.jpg \
     --model HYVideo-T/2 \
+    --prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
     --i2v-mode \
+    --i2v-image-path ./assets/demo/i2v/imgs/0.jpg \
     --i2v-resolution 720p \
     --i2v-stability \
     --infer-steps 50 \
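The stable-mode command above is cut off at the hunk boundary after `--infer-steps 50`. As a hedged illustration of how the documented flags compose, here is a sketch that batches the call over a folder of reference images. The `*.jpg` glob, the per-image output folders, and the `--flow-shift 7.0` value are assumptions, not values confirmed by this diff (only `0.jpg` and the flags shown above are):

```bash
# Hedged sketch, run from inside HunyuanVideo-I2V.
# Assumptions: extra .jpg files beside 0.jpg, and --flow-shift 7.0 for stable mode.
for img in ./assets/demo/i2v/imgs/*.jpg; do
    name=$(basename "$img" .jpg)
    python3 sample_image2video.py \
        --model HYVideo-T/2 \
        --prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
        --i2v-mode \
        --i2v-image-path "$img" \
        --i2v-resolution 720p \
        --i2v-stability \
        --infer-steps 50 \
        --video-length 129 \
        --flow-reverse \
        --flow-shift 7.0 \
        --embedded-cfg-scale 6.0 \
        --use-cpu-offload \
        --save-path "./results/${name}"
done
```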
@@ -220,17 +226,17 @@ If you want to generate a more **high-dynamic** video, you can **unset** `--i2v-
 cd HunyuanVideo-I2V

 python3 sample_image2video.py \
-    --prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
-    --i2v-image-path ./demo/imgs/0.jpg \
     --model HYVideo-T/2 \
+    --prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
     --i2v-mode \
+    --i2v-image-path ./assets/demo/i2v/imgs/0.jpg \
     --i2v-resolution 720p \
     --infer-steps 50 \
     --video-length 129 \
     --flow-reverse \
     --flow-shift 17.0 \
-    --seed 0 \
     --embedded-cfg-scale 6.0 \
+    --seed 0 \
     --use-cpu-offload \
     --save-path ./results
 ```
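Because the high-dynamic setting omits `--i2v-stability`, results vary more from run to run. A minimal sketch for sweeping `--seed` to collect candidates; the seed list and the per-seed output folders are assumptions, everything else mirrors the command above:

```bash
# Hedged sketch: rerun the high-dynamic command with several seeds.
for seed in 0 1 2; do
    python3 sample_image2video.py \
        --model HYVideo-T/2 \
        --prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
        --i2v-mode \
        --i2v-image-path ./assets/demo/i2v/imgs/0.jpg \
        --i2v-resolution 720p \
        --infer-steps 50 \
        --video-length 129 \
        --flow-reverse \
        --flow-shift 17.0 \
        --embedded-cfg-scale 6.0 \
        --seed "$seed" \
        --use-cpu-offload \
        --save-path "./results/seed_${seed}"
done
```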
@@ -256,7 +262,7 @@ We list some more useful configurations for easy usage:

-
+## 🎉 Customizable I2V LoRA effects training

 ### Requirements

@@ -283,11 +289,13 @@ Prompt description: The trigger word is written directly in the video caption. I

 For example, AI hair growth effect (trigger): rapid_hair_growth, The hair of the characters in the video is growing rapidly. + original prompt

 After having the training video and prompt pair, refer to [here](hyvideo/hyvae_extract/README.md) for training data construction.

 ### Training
 ```
+cd HunyuanVideo-I2V
+
 sh scripts/run_train_image2video_lora.sh
 ```
 We list some training specific configurations for easy usage:
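Once `sh scripts/run_train_image2video_lora.sh` completes, the hunk below notes that the trained weights are saved as `pytorch_lora_kohaya_weights.safetensors` under `{SAVE_PATH}`. A minimal sketch for locating that file; `./results/lora` is a hypothetical stand-in for your configured save path:

```bash
# Hedged sketch: locate the trained LoRA weights under your own SAVE_PATH.
SAVE_PATH=./results/lora   # assumption: replace with the {SAVE_PATH} used for training
find "$SAVE_PATH" -name "pytorch_lora_kohaya_weights.safetensors"
```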
@@ -303,22 +311,26 @@ After training, you can find `pytorch_lora_kohaya_weights.safetensors` in `{SAVE

 ### Inference
 ```bash
+cd HunyuanVideo-I2V
+
 python3 sample_image2video.py \
+    --model HYVideo-T/2 \
+    --prompt "Two people hugged tightly, In the video, two people are standing apart from each other. They then move closer to each other and begin to hug tightly. The hug is very affectionate, with the two people holding each other tightly and looking into each other's eyes. The interaction is very emotional and heartwarming, with the two people expressing their love and affection for each other." \
+    --i2v-mode \
+    --i2v-image-path ./assets/demo/i2v_lora/imgs/embrace.png \
+    --i2v-resolution 720p \
+    --i2v-stability \
+    --infer-steps 50 \
+    --video-length 129 \
+    --flow-reverse \
+    --flow-shift 5.0 \
+    --embedded-cfg-scale 6.0 \
+    --seed 0 \
+    --use-cpu-offload \
+    --save-path ./results \
+    --use-lora \
+    --lora-scale 1.0 \
+    --lora-path ./ckpts/hunyuan-video-i2v-720p/lora/embrace_kohaya_weights.safetensors
 ```
 We list some lora specific configurations for easy usage:
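Per the table below, `--lora-scale` controls how strongly the LoRA weights are fused into the base model. A hedged sketch sweeping that one flag while everything else stays fixed; the scale values and the per-scale output folders are assumptions:

```bash
# Hedged sketch: compare LoRA fusion strengths with a fixed seed and prompt.
for scale in 0.5 0.75 1.0; do
    python3 sample_image2video.py \
        --model HYVideo-T/2 \
        --prompt "Two people hugged tightly, In the video, two people are standing apart from each other. They then move closer to each other and begin to hug tightly. The hug is very affectionate, with the two people holding each other tightly and looking into each other's eyes. The interaction is very emotional and heartwarming, with the two people expressing their love and affection for each other." \
        --i2v-mode \
        --i2v-image-path ./assets/demo/i2v_lora/imgs/embrace.png \
        --i2v-resolution 720p \
        --i2v-stability \
        --infer-steps 50 \
        --video-length 129 \
        --flow-reverse \
        --flow-shift 5.0 \
        --embedded-cfg-scale 6.0 \
        --seed 0 \
        --use-cpu-offload \
        --save-path "./results/lora_scale_${scale}" \
        --use-lora \
        --lora-scale "$scale" \
        --lora-path ./ckpts/hunyuan-video-i2v-720p/lora/embrace_kohaya_weights.safetensors
done
```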
@@ -326,7 +338,7 @@ We list some lora specific configurations for easy usage:
 |:-------------------:|:-------:|:----------------------------:|
 | `--use-lora` | False | Whether to open lora mode. |
 | `--lora-scale` | 1.0 | Fusion scale for lora model. |
 | `--lora-path` | "" | Weight path for lora model. |

 ## 🔗 BibTeX