Image-to-Video
noaltian committed
Commit e405da9 Β· verified Β· 1 Parent(s): 56d4f04

Update README.md

Files changed (1)
  1. README.md +38 -26
README.md CHANGED
@@ -28,13 +28,13 @@ This repo contains official PyTorch model definitions, pre-trained weights and in
 * Mar 06, 2025: πŸ‘‹ We release the inference code and model weights of HunyuanVideo-I2V. [Download](https://github.com/Tencent/HunyuanVideo-I2V/blob/main/ckpts/README.md).
 
 
-### First Frame Consistency Demo
+<!-- ### First Frame Consistency Demo
 | Reference Image | Generated Video |
 |:----------------:|:----------------:|
 | <img src="https://github.com/user-attachments/assets/83e7a097-ffca-40db-9c72-be01d866aa7d" width="80%"> | <video src="https://github.com/user-attachments/assets/f81d2c88-bb1a-43f8-b40f-1ccc20774563" width="100%"> </video> |
 | <img src="https://github.com/user-attachments/assets/c385a11f-60c7-4919-b0f1-bc5e715f673c" width="80%"> | <video src="https://github.com/user-attachments/assets/0c29ede9-0481-4d40-9c67-a4b6267fdc2d" width="100%"> </video> |
 | <img src="https://github.com/user-attachments/assets/5763f5eb-0be5-4b36-866a-5199e31c5802" width="95%"> | <video src="https://github.com/user-attachments/assets/a8da0a1b-ba7d-45a4-a901-5d213ceaf50e" width="100%"> </video> |
-
+-->
 <!-- ### Customizable I2V LoRA Demo
 
 | I2V Lora Effect | Reference Image | Generated Video |
@@ -65,7 +65,7 @@ This repo contains official PyTorch model definitions, pre-trained weights and in
 - [x] Inference
 - [x] Checkpoints
 - [x] ComfyUI
-- [ ] LoRA training scripts
+- [x] LoRA training scripts
 - [ ] Multi-GPU sequence-parallel inference (faster inference on more GPUs)
 - [ ] Diffusers
 - [ ] FP8 quantized weights
@@ -84,6 +84,12 @@ This repo contains official PyTorch model definitions, pre-trained weights and in
   - [Tips for Using Image-to-Video Models](#tips-for-using-image-to-video-models)
   - [Using Command Line](#using-command-line)
   - [More Configurations](#more-configurations)
+- [πŸŽ‰ Customizable I2V LoRA effects training](#-customizable-i2v-lora-effects-training)
+  - [Requirements](#requirements)
+  - [Environment](#environment)
+  - [Training data construction](#training-data-construction)
+  - [Training](#training)
+  - [Inference](#inference)
 - [πŸ”— BibTeX](#-bibtex)
 - [Acknowledgements](#acknowledgements)
 ---
@@ -200,10 +206,10 @@ If you want to generate a more **stable** video, you can set `--i2v-stability` a
 cd HunyuanVideo-I2V
 
 python3 sample_image2video.py \
-    --prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
-    --i2v-image-path ./demo/imgs/0.jpg \
     --model HYVideo-T/2 \
+    --prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
     --i2v-mode \
+    --i2v-image-path ./assets/demo/i2v/imgs/0.jpg \
     --i2v-resolution 720p \
     --i2v-stability \
     --infer-steps 50 \
@@ -220,17 +226,17 @@ If you want to generate a more **high-dynamic** video, you can **unset** `--i2v-
 cd HunyuanVideo-I2V
 
 python3 sample_image2video.py \
-    --prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
-    --i2v-image-path ./demo/imgs/0.jpg \
     --model HYVideo-T/2 \
+    --prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
     --i2v-mode \
+    --i2v-image-path ./assets/demo/i2v/imgs/0.jpg \
     --i2v-resolution 720p \
     --infer-steps 50 \
     --video-length 129 \
     --flow-reverse \
     --flow-shift 17.0 \
-    --seed 0 \
     --embedded-cfg-scale 6.0 \
+    --seed 0 \
     --use-cpu-offload \
     --save-path ./results
 ```
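
The two commands above differ mainly in whether `--i2v-stability` is set. Below is a minimal sketch for rendering the same image and prompt in both modes for a side-by-side check; the loop, the per-mode save paths, and reusing `--flow-shift 17.0` for the stable run are our own assumptions, not part of this commit:

```bash
cd HunyuanVideo-I2V

# Render the same reference image with and without --i2v-stability to
# compare first-frame consistency against motion dynamics. All flags are
# copied from the two commands above; only the loop and the per-mode
# save paths are added here. Note: 17.0 is the high-dynamic example's
# --flow-shift; the stable example may recommend a different value.
for mode in stable dynamic; do
  stability_flag=""
  [ "$mode" = "stable" ] && stability_flag="--i2v-stability"
  python3 sample_image2video.py \
      --model HYVideo-T/2 \
      --prompt "An Asian man with short hair in black tactical uniform and white clothes waves a firework stick." \
      --i2v-mode \
      --i2v-image-path ./assets/demo/i2v/imgs/0.jpg \
      --i2v-resolution 720p \
      $stability_flag \
      --infer-steps 50 \
      --video-length 129 \
      --flow-reverse \
      --flow-shift 17.0 \
      --embedded-cfg-scale 6.0 \
      --seed 0 \
      --use-cpu-offload \
      --save-path ./results/$mode
done
```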
@@ -256,7 +262,7 @@ We list some more useful configurations for easy usage:
 
 
 
-<!-- ## πŸŽ‰ Customizable I2V LoRA effects training
+## πŸŽ‰ Customizable I2V LoRA effects training
 
 ### Requirements
 
@@ -283,11 +289,13 @@ Prompt description: The trigger word is written directly in the video caption. I
 
 For example, AI hair growth effect (trigger): rapid_hair_growth, The hair of the characters in the video is growing rapidly. + original prompt
 
 After having the training video and prompt pair, refer to [here](hyvideo/hyvae_extract/README.md) for training data construction.
 
 
 ### Training
 ```
+cd HunyuanVideo-I2V
+
 sh scripts/run_train_image2video_lora.sh
 ```
 We list some training-specific configurations for easy usage:
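
Tying the steps above together: captions follow the "trigger word, effect description + original caption" pattern from the hair-growth example, and training runs through the provided script. A rough sketch; the sample caption is hypothetical, and the weights filename comes from the note below that training exports `pytorch_lora_kohaya_weights.safetensors` under the save path:

```bash
cd HunyuanVideo-I2V

# Caption pattern from the hair-growth example above:
#   "<trigger>, <effect description> <original caption>"
trigger="rapid_hair_growth"
effect="The hair of the characters in the video is growing rapidly."
original="A woman stands in a garden, smiling at the camera."  # hypothetical original caption
echo "${trigger}, ${effect} ${original}"

# Launch LoRA training (data and save paths are configured inside the
# script), then look for the exported weights named in the README.
sh scripts/run_train_image2video_lora.sh
find . -name "pytorch_lora_kohaya_weights.safetensors"
```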
@@ -303,22 +311,26 @@ After training, you can find `pytorch_lora_kohaya_weights.safetensors` in `{SAVE
 
 ### Inference
 ```bash
+cd HunyuanVideo-I2V
+
 python3 sample_image2video.py \
-    --model HYVideo-T/2 \
-    --prompt "Two people hugged tightly, In the video, two people are standing apart from each other. They then move closer to each other and begin to hug tightly. The hug is very affectionate, with the two people holding each other tightly and looking into each other's eyes. The interaction is very emotional and heartwarming, with the two people expressing their love and affection for each other." \
-    --i2v-mode \
-    --i2v-image-path ./assets/demo/i2v_lora/imgs/embrace.png \
-    --i2v-resolution 720p \
-    --infer-steps 50 \
-    --video-length 129 \
-    --flow-reverse \
-    --flow-shift 5.0 \
-    --seed 0 \
-    --use-cpu-offload \
-    --save-path ./results \
-    --use-lora \
-    --lora-scale 1.0 \
-    --lora-path ./ckpts/hunyuan-video-i2v-720p/lora/embrace_kohaya_weights.safetensors
+    --model HYVideo-T/2 \
+    --prompt "Two people hugged tightly, In the video, two people are standing apart from each other. They then move closer to each other and begin to hug tightly. The hug is very affectionate, with the two people holding each other tightly and looking into each other's eyes. The interaction is very emotional and heartwarming, with the two people expressing their love and affection for each other." \
+    --i2v-mode \
+    --i2v-image-path ./assets/demo/i2v_lora/imgs/embrace.png \
+    --i2v-resolution 720p \
+    --i2v-stability \
+    --infer-steps 50 \
+    --video-length 129 \
+    --flow-reverse \
+    --flow-shift 5.0 \
+    --embedded-cfg-scale 6.0 \
+    --seed 0 \
+    --use-cpu-offload \
+    --save-path ./results \
+    --use-lora \
+    --lora-scale 1.0 \
+    --lora-path ./ckpts/hunyuan-video-i2v-720p/lora/embrace_kohaya_weights.safetensors
 ```
 We list some LoRA-specific configurations for easy usage:
 
@@ -326,7 +338,7 @@ We list some LoRA-specific configurations for easy usage:
 |:-------------------:|:-------:|:----------------------------:|
 | `--use-lora` | False | Whether to enable LoRA mode. |
 | `--lora-scale` | 1.0 | Fusion scale for the LoRA model. |
-| `--lora-path` | "" | Weight path for the LoRA model. | -->
+| `--lora-path` | "" | Weight path for the LoRA model. |
 
 
 ## πŸ”— BibTeX
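
Per the table above, `--lora-scale` sets how strongly the LoRA weights are fused into the base model (default 1.0). A sketch that sweeps the scale around the LoRA inference command from this commit; the chosen scale values, the abridged prompt, and the per-run save paths are illustrative assumptions:

```bash
cd HunyuanVideo-I2V

# Sweep --lora-scale to compare how strongly the "embrace" LoRA effect is
# fused in. Other flags mirror the LoRA inference command above; the
# prompt is abridged here for brevity.
for scale in 0.5 0.75 1.0; do
  python3 sample_image2video.py \
      --model HYVideo-T/2 \
      --prompt "Two people hugged tightly, In the video, two people are standing apart from each other. They then move closer to each other and begin to hug tightly." \
      --i2v-mode \
      --i2v-image-path ./assets/demo/i2v_lora/imgs/embrace.png \
      --i2v-resolution 720p \
      --i2v-stability \
      --infer-steps 50 \
      --video-length 129 \
      --flow-reverse \
      --flow-shift 5.0 \
      --embedded-cfg-scale 6.0 \
      --seed 0 \
      --use-cpu-offload \
      --save-path "./results/lora_scale_${scale}" \
      --use-lora \
      --lora-scale "${scale}" \
      --lora-path ./ckpts/hunyuan-video-i2v-720p/lora/embrace_kohaya_weights.safetensors
done
```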
 