Commit · c7c6869
Parent(s): b0f868c
new
README.md CHANGED

@@ -43,7 +43,7 @@ https://github.com/user-attachments/assets/dc54bc11-48cc-4814-9879-bf2699ee9d1d
 * **[2025/1/23]** Our paper is accepted to [ICLR2025](https://openreview.net/forum?id=SSslAtcPB6)! Welcome to **watch** this repository for the latest updates.
 
 
-##
+## Setup Environment
 Our method is tested with CUDA 12.1, fp16 (accelerate) and xformers on a single L40.
 
 ```bash
@@ -68,23 +68,12 @@ You may download all the base model checkpoints using the following bash command
 bash download_all.sh
 ```
 
-
-
-
-mkdir annotator/ckpts
-```
-Method 1: Download dwpose models
-
-(Note: if your are avaiable to huggingface, other models like depth_zoe etc can be automatically downloaded)
-
-Download dwpose model dw-ll_ucoco_384.onnx ([baidu](https://pan.baidu.com/s/1nuBjw-KKSxD_BkpmwXUJiw?pwd=28d7), [google](https://drive.google.com/file/d/12L8E2oAgZy4VACGSK9RaZBZrfgx7VTA2/view?usp=sharing)) and Det model yolox_l.onnx ([baidu](https://pan.baidu.com/s/1fpfIVpv5ypo4c1bUlzkMYQ?pwd=mjdn), [google](https://drive.google.com/file/d/1w9pXC8tT0p9ndMN-CArp1__b2GbzewWI/view?usp=sharing)),
-Then put them into ./annotator/ckpts.
-
-Method 2: Download all annotator checkpoints from google or baiduyun (when can not access to huggingface)
-
-If you cannot access HuggingFace, you can download all the annotator checkpoints (such as DW-Pose, depth_zoe, depth_midas, and OpenPose, cost around 4G.) from [baidu](https://pan.baidu.com/s/1sgBFLFkdTCDTn4oqHjGb9A?pwd=pdm5) or [google](https://drive.google.com/file/d/1qOsmWshnFMMr8x1HteaTViTSQLh_4rle/view?usp=drive_link)
+<details><summary>Click for ControlNet annotator weights (if you cannot access huggingface)</summary>
+
+You can download all the annotator checkpoints (such as DW-Pose, depth_zoe, depth_midas, and OpenPose; around 4 GB in total) from [baidu](https://pan.baidu.com/s/1sgBFLFkdTCDTn4oqHjGb9A?pwd=pdm5) or [google](https://drive.google.com/file/d/1qOsmWshnFMMr8x1HteaTViTSQLh_4rle/view?usp=drive_link)
 Then extract them into ./annotator/ckpts
 
+</details>
 
 ## Prepare all the data
 
@@ -95,11 +84,12 @@ tar -zxvf videograin_data.tar.gz
 
 ## VideoGrain Editing
 
-
+### Inference
+VideoGrain is a training-free framework. To run the inference script, use the following command:
 
 ```bash
 bash test.sh
-
+or accelerate launch test.py --config config/part_level/adding_new_object/run_two_man/running_spider_polar_sunglass.yaml
 ```
 
 <details><summary>The result is saved at `./result`. (Click for directory structure)</summary>
@@ -107,12 +97,16 @@ bash test.sh
 ```
 result
 ├── run_two_man
+│   ├── control                  # control condition
 │   ├── infer_samples
+│   ├── input                    # the input video frames
+│   ├── masked_video.mp4         # check whether edit regions are accurately covered
 │   ├── sample
-│       ├── step_0
-│       ├── step_0.mp4
-│       ├── source_video.mp4
-
+│       ├── step_0               # result image folder
+│       ├── step_0.mp4           # result video
+│       ├── source_video.mp4     # the input video
+│       ├── visualization_denoise  # cross-attention weights
+│       ├── sd_study             # cluster inversion features
 ```
 
 </details>
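The annotator-checkpoint instructions above give only the download links. Below is a rough sketch of how those weights could be fetched and unpacked into `./annotator/ckpts` with gdown (already pinned in requirements.txt); the Drive file ID comes from the google link in the diff, while the archive name and tar.gz format are assumptions not shown there.

```bash
# Sketch only: the file ID is taken from the google link above; the archive
# name and .tar.gz format are assumptions -- adjust to the actual download.
mkdir -p annotator/ckpts
gdown 1qOsmWshnFMMr8x1HteaTViTSQLh_4rle -O annotator_ckpts.tar.gz
tar -zxvf annotator_ckpts.tar.gz -C annotator/ckpts
```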
annotator/dwpose/__pycache__/wholebody.cpython-310.pyc CHANGED
Binary files a/annotator/dwpose/__pycache__/wholebody.cpython-310.pyc and b/annotator/dwpose/__pycache__/wholebody.cpython-310.pyc differ
annotator/dwpose/wholebody.py CHANGED

@@ -1,15 +1,32 @@
 import cv2
 import numpy as np
-
+import os
+os.environ["KMP_DUPLICATE_LIB_OK"]="TRUE"
 import onnxruntime as ort
 from .onnxdet import inference_detector
 from .onnxpose import inference_pose
+from annotator.util import annotator_ckpts_path
+
 
 class Wholebody:
     def __init__(self):
         device = 'cuda:0'
         providers = ['CPUExecutionProvider'
                      ] if device == 'cpu' else ['CUDAExecutionProvider']
+
+        remote_dw_pose_path = "https://huggingface.co/sxela/dwpose_ckpts/resolve/main/dw-ll_ucoco_384.onnx"
+        remote_yolox_path = "https://huggingface.co/sxela/dwpose_ckpts/resolve/main/yolox_l.onnx"
+
+        dw_pose_path = os.path.join(annotator_ckpts_path, "dw-ll_ucoco_384.onnx")
+        yolox_path = os.path.join(annotator_ckpts_path, "yolox_l.onnx")
+
+        if not os.path.exists(dw_pose_path):
+            from basicsr.utils.download_util import load_file_from_url
+            load_file_from_url(remote_dw_pose_path, model_dir=annotator_ckpts_path)
+        if not os.path.exists(yolox_path):
+            from basicsr.utils.download_util import load_file_from_url
+            load_file_from_url(remote_yolox_path, model_dir=annotator_ckpts_path)
+
         onnx_det = 'annotator/ckpts/yolox_l.onnx'
         onnx_pose = 'annotator/ckpts/dw-ll_ucoco_384.onnx'
 
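The wholebody.py change above makes the DW-Pose and YOLOX checkpoints download themselves on first use instead of requiring the manual step the README used to describe. Here is a minimal standalone sketch of the same lazy-download pattern, using the URLs and checkpoint directory from the diff; `ensure_checkpoint` is a hypothetical helper, not part of the repo.

```python
# Sketch of the lazy-download pattern used in Wholebody.__init__ above.
# `ensure_checkpoint` is hypothetical; URLs and the ckpt dir mirror the diff.
import os

from basicsr.utils.download_util import load_file_from_url

ANNOTATOR_CKPTS = "annotator/ckpts"
REMOTE_DWPOSE = "https://huggingface.co/sxela/dwpose_ckpts/resolve/main/dw-ll_ucoco_384.onnx"
REMOTE_YOLOX = "https://huggingface.co/sxela/dwpose_ckpts/resolve/main/yolox_l.onnx"


def ensure_checkpoint(url: str, ckpt_dir: str = ANNOTATOR_CKPTS) -> str:
    """Download `url` into `ckpt_dir` unless it is already there; return the local path."""
    local_path = os.path.join(ckpt_dir, os.path.basename(url))
    if not os.path.exists(local_path):
        os.makedirs(ckpt_dir, exist_ok=True)
        load_file_from_url(url, model_dir=ckpt_dir)
    return local_path


if __name__ == "__main__":
    onnx_det = ensure_checkpoint(REMOTE_YOLOX)
    onnx_pose = ensure_checkpoint(REMOTE_DWPOSE)
    print(onnx_det, onnx_pose)
```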
config/instance_level/running_two_man/running_3cls_polar_spider_vis_weight.yaml CHANGED

@@ -1,5 +1,5 @@
 pretrained_model_path: "./ckpt/stable-diffusion-v1-5"
-logdir: ./result/run_two_man/instance_level/
+logdir: ./result/run_two_man/instance_level/3cls_spider_polar_vis_cross_attn
 
 dataset_config:
   path: "data/run_two_man/run_two_man_fr2"
requirements.txt CHANGED

@@ -65,4 +65,5 @@ scikit-learn==1.2.2
 nltk==3.8.1
 timm==0.6.7
 scikit-image==0.24.0
-gdown==5.1.0
+gdown==5.1.0
+basicsr-fixed
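The new `basicsr-fixed` entry presumably supplies the `basicsr` package that the download code added to wholebody.py imports (assuming it installs under the usual `basicsr` import name). A quick sanity check after installing:

```bash
# Assumes basicsr-fixed exposes the standard `basicsr` import path.
pip install -r requirements.txt
python -c "from basicsr.utils.download_util import load_file_from_url; print('basicsr import OK')"
```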
video_diffusion/common/__pycache__/image_util.cpython-310.pyc CHANGED
Binary files a/video_diffusion/common/__pycache__/image_util.cpython-310.pyc and b/video_diffusion/common/__pycache__/image_util.cpython-310.pyc differ