Commit
·
f7c5396
1
Parent(s):
5aca2b0
update
- README.md +67 -20
- assets/class-level/bear.gif +3 -0
- assets/class-level/car-1.gif +3 -0
- assets/class-level/husky.gif +3 -0
- assets/class-level/pig.gif +3 -0
- assets/class-level/posche.gif +3 -0
- assets/class-level/tennis.gif +3 -0
- assets/class-level/tennis_1cls.gif +3 -0
- assets/class-level/tennis_3cls.gif +3 -0
- assets/class-level/tiger.gif +3 -0
- assets/class-level/wolf.gif +3 -0
- assets/{bear_weight.gif → vis/bear_weight.gif} +0 -0
- config/part_level/adding_new_object/run_two_man/{running_spider_polar_sunglass.yaml → spider_polar_sunglass.yaml} +0 -0
- test.sh +1 -1
README.md
CHANGED
|
@@ -108,32 +108,20 @@ python image_util/sample_video2frames.py --video_path 'your video path' --output
|
|
| 108 |
We segment videos using our ReLER lab's [SAM-Track](https://github.com/z-x-yang/Segment-and-Track-Anything). We suggest running `app.py` in SAM-Track in `gradio` mode to manually select the regions of the video you want to edit. We also provide a script, `image_util/process_webui_mask.py`, to convert masks from the SAM-Track path layout to the VideoGrain path layout.
|
| 109 |
|
| 110 |
|
| 111 |
-
##
|
| 112 |
|
| 113 |
-
### Inference
|
| 114 |
-
|
| 115 |
-
**🔛prepare your config**
|
| 116 |
-
|
| 117 |
-
VideoGrain is a training-free framework. To run VideoGrain on your video, modify `./config/demo_config.yaml` based on your needs:
|
| 118 |
-
|
| 119 |
-
1. Set your pretrained model path and ControlNet path in your config. You can change the `control_type` to `dwpose`, `depth_zoe`, or `depth` (MiDaS).
|
| 120 |
-
2. Prepare your video frames and layout masks (edit regions) using SAM-Track or SAM2, and reference them in the dataset config.
|
| 121 |
-
3. Change the `prompt`, and extract each `local prompt` from the editing prompt. The local prompts must be in the same order as the layout masks.
|
| 122 |
-
4. You can change the flatten resolution: 1->64, 2->16, 4->8 (flattening at 64 usually works best).
|
| 123 |
-
5. To improve temporal consistency, set `use_pnp: True` and `inject_step` to 5 or 10. (Note: more than 10 PnP injection steps degrades multi-region editing.)
|
| 124 |
-
6. To visualize the cross-attention weights, set `vis_cross_attn: True`.
|
| 125 |
-
7. To cluster the DDIM inversion spatial-temporal video features, set `cluster_inversion_feature: True`.
|
| 126 |
-
|
| 127 |
-
**😍Editing your video**
|
| 128 |
|
| 129 |
```bash
|
| 130 |
bash test.sh
|
| 131 |
#or
|
| 132 |
-
CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config
|
| 133 |
```
|
| 134 |
|
| 135 |
-
|
| 136 |
|
|
|
|
| 137 |
```
|
| 138 |
result
|
| 139 |
├── run_two_man
|
|
@@ -150,6 +138,28 @@ result
|
|
| 150 |
```
|
| 151 |
</details>
|
| 152 |
|
| 153 |
## 🚀Multi-Grained Video Editing Results
|
| 154 |
|
| 155 |
### 🌈 Multi-Grained Definition
|
|
@@ -207,7 +217,7 @@ CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/instance_level
|
|
| 207 |
</tr>
|
| 208 |
</table>
|
| 209 |
|
| 210 |
-
## 🕺
|
| 211 |
You can get part-level video editing results using the following command:
|
| 212 |
```bash
|
| 213 |
CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/part_level/modification/man_text_message/blue_shirt.yaml
|
|
@@ -246,6 +256,43 @@ CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/part_level/modi
|
|
| 246 |
<td width=15% style="text-align:center;">superman </td>
|
| 247 |
<td width=15% style="text-align:center;">superman + sunglasses</td>
|
| 248 |
</tr>
|
| 249 |
</table>
|
| 250 |
|
| 251 |
|
|
@@ -284,7 +331,7 @@ CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/instance_level/
|
|
| 284 |
<td><img src="assets/soely_edit/input.gif"></td>
|
| 285 |
<td><img src="assets/vis/edit.gif"></td>
|
| 286 |
<td><img src="assets/vis/spiderman_weight.gif"></td>
|
| 287 |
-
<td><img src="assets/bear_weight.gif"></td>
|
| 288 |
<td><img src="/assets/vis/cherry_weight.gif"></td>
|
| 289 |
</tr>
|
| 290 |
<tr>
|
|
|
|
| 108 |
We segment videos using our ReLER lab's [SAM-Track](https://github.com/z-x-yang/Segment-and-Track-Anything). We suggest running `app.py` in SAM-Track in `gradio` mode to manually select the regions of the video you want to edit. We also provide a script, `image_util/process_webui_mask.py`, to convert masks from the SAM-Track path layout to the VideoGrain path layout.
|
| 109 |
|
| 110 |
|
| 111 |
+
## 🔥🔥🔥 VideoGrain Editing
|
| 112 |
|
| 113 |
+
### 🎨 Inference
|
| 114 |
+
You can reproduce the instance- and part-level results in our teaser by running:
|
| 115 |
|
| 116 |
```bash
|
| 117 |
bash test.sh
|
| 118 |
#or
|
| 119 |
+
CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/part_level/adding_new_object/run_two_man/spider_polar_sunglass.yaml
|
| 120 |
```
|
| 121 |
|
| 122 |
+
For the other instance-, part-, and class-level results on the VideoGrain project page or in the teaser, we provide all the data (video frames and layout masks) and the corresponding configs needed to reproduce them. The results are shown in [🚀Multi-Grained Video Editing Results](#multi-grained-video-editing-results).
|
| 123 |
|
| 124 |
+
<details><summary>The result is saved at `./result`. (Click for directory structure) </summary>
|
| 125 |
```
|
| 126 |
result
|
| 127 |
├── run_two_man
|
|
|
|
| 138 |
```
|
| 139 |
</details>
|
| 140 |
|
| 141 |
+
|
| 142 |
+
## Editing guidance for YOUR Video
|
| 143 |
+
### 🔛Prepare your config
|
| 144 |
+
|
| 145 |
+
VideoGrain is a training-free framework. To run VideoGrain on your video, modify `./config/demo_config.yaml` based on your needs:
|
| 146 |
+
|
| 147 |
+
1. Set your pretrained model path and ControlNet path in your config. You can change the `control_type` to `dwpose`, `depth_zoe`, or `depth` (MiDaS).
|
| 148 |
+
2. Prepare your video frames and layout masks (edit regions) using SAM-Track or SAM2, and reference them in the dataset config.
|
| 149 |
+
3. Change the `prompt`, and extract each `local prompt` from the editing prompt. The local prompts must be in the same order as the layout masks.
|
| 150 |
+
4. You can change the flatten resolution: 1->64, 2->16, 4->8 (flattening at 64 usually works best).
|
| 151 |
+
5. To improve temporal consistency, set `use_pnp: True` and `inject_step` to 5 or 10. (Note: more than 10 PnP injection steps degrades multi-region editing.)
|
| 152 |
+
6. To visualize the cross-attention weights, set `vis_cross_attn: True`.
|
| 153 |
+
7. To cluster the DDIM inversion spatial-temporal video features, set `cluster_inversion_feature: True`.
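Before launching, it can help to confirm which of the options listed above are actually set in your config. The following is a minimal sketch, not part of VideoGrain: the helper name `show_edit_options` is hypothetical, and it assumes the option names appear verbatim as keys in the YAML file.

```bash
# Hypothetical helper (not part of VideoGrain): print, with line numbers,
# the lines of a config file that mention the options discussed above.
# Assumption: the option names appear verbatim in the YAML file.
show_edit_options() {
    grep -nE 'control_type|use_pnp|inject_step|vis_cross_attn|cluster_inversion_feature' "$1"
}

# Usage (config path taken from the text above):
# show_edit_options ./config/demo_config.yaml
```

If the command prints nothing, none of these keys were found under those exact names, and the config may use a different layout than this sketch assumes.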
|
| 154 |
+
|
| 155 |
+
### 😍Editing your video
|
| 156 |
+
|
| 157 |
+
```bash
|
| 158 |
+
bash test.sh
|
| 159 |
+
#or
|
| 160 |
+
CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config /path/to/the/config
|
| 161 |
+
```
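To run several configs in a row, the launch command can be wrapped in a small loop. This is a minimal sketch: `launch_cmd` and `GPU_ID` are hypothetical names introduced here, and the loop only prints the commands (a dry run) rather than executing them. The two config paths are the ones used elsewhere in this README.

```bash
# Hypothetical helper: build the launch command for one config file.
# GPU_ID is an assumed variable name; it defaults to 0, as in test.sh.
launch_cmd() {
    printf 'CUDA_VISIBLE_DEVICES=%s accelerate launch test.py --config %s\n' "${GPU_ID:-0}" "$1"
}

# Dry run: print one launch command per config.
for cfg in \
    config/part_level/adding_new_object/run_two_man/spider_polar_sunglass.yaml \
    config/class_level/wolf/wolf.yaml
do
    launch_cmd "$cfg"
done
```

Printing the commands first makes it easy to check the config paths before committing GPU time to a run.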
|
| 162 |
+
|
| 163 |
## 🚀Multi-Grained Video Editing Results
|
| 164 |
|
| 165 |
### 🌈 Multi-Grained Definition
|
|
|
|
| 217 |
</tr>
|
| 218 |
</table>
|
| 219 |
|
| 220 |
+
## 🕺 Part-level Video Editing
|
| 221 |
You can get part-level video editing results using the following command:
|
| 222 |
```bash
|
| 223 |
CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/part_level/modification/man_text_message/blue_shirt.yaml
|
|
|
|
| 256 |
<td width=15% style="text-align:center;">superman </td>
|
| 257 |
<td width=15% style="text-align:center;">superman + sunglasses</td>
|
| 258 |
</tr>
|
| 259 |
+
</table>
|
| 260 |
+
|
| 261 |
+
## 🥳 Class-level Video Editing
|
| 262 |
+
You can get class-level video editing results using the following command:
|
| 263 |
+
```bash
|
| 264 |
+
CUDA_VISIBLE_DEVICES=0 accelerate launch test.py --config config/class_level/wolf/wolf.yaml
|
| 265 |
+
```
|
| 266 |
+
|
| 267 |
+
<table class="center">
|
| 268 |
+
<tr>
|
| 269 |
+
<td><img src="assets/class-level/wolf.gif"></td>
|
| 270 |
+
<td><img src="assets/class-level/pig.gif"></td>
|
| 271 |
+
<td><img src="assets/class-level/husky.gif"></td>
|
| 272 |
+
<td><img src="assets/class-level/bear.gif"></td>
|
| 273 |
+
<td><img src="assets/class-level/tiger.gif"></td>
|
| 274 |
+
</tr>
|
| 275 |
+
<tr>
|
| 276 |
+
<td width=15% style="text-align:center;">input</td>
|
| 277 |
+
<td width=15% style="text-align:center;">pig</td>
|
| 278 |
+
<td width=15% style="text-align:center;">husky</td>
|
| 279 |
+
<td width=15% style="text-align:center;">bear</td>
|
| 280 |
+
<td width=15% style="text-align:center;">tiger</td>
|
| 281 |
+
</tr>
|
| 282 |
+
<tr>
|
| 283 |
+
<td><img src="assets/class-level/tennis.gif"></td>
|
| 284 |
+
<td><img src="assets/class-level/tennis_1cls.gif"></td>
|
| 285 |
+
<td><img src="assets/class-level/tennis_3cls.gif"></td>
|
| 286 |
+
<td><img src="assets/class-level/car-1.gif"></td>
|
| 287 |
+
<td><img src="assets/class-level/posche.gif"></td>
|
| 288 |
+
</tr>
|
| 289 |
+
<tr>
|
| 290 |
+
<td width=15% style="text-align:center;">input</td>
|
| 291 |
+
<td width=15% style="text-align:center;">iron man</td>
|
| 292 |
+
<td width=15% style="text-align:center;">Batman + snow court + iced wall</td>
|
| 293 |
+
<td width=15% style="text-align:center;">input </td>
|
| 294 |
+
<td width=15% style="text-align:center;">Porsche</td>
|
| 295 |
+
</tr>
|
| 296 |
</table>
|
| 297 |
|
| 298 |
|
|
|
|
| 331 |
<td><img src="assets/soely_edit/input.gif"></td>
|
| 332 |
<td><img src="assets/vis/edit.gif"></td>
|
| 333 |
<td><img src="assets/vis/spiderman_weight.gif"></td>
|
| 334 |
+
<td><img src="assets/vis/bear_weight.gif"></td>
|
| 335 |
<td><img src="/assets/vis/cherry_weight.gif"></td>
|
| 336 |
</tr>
|
| 337 |
<tr>
|
assets/class-level/bear.gif
ADDED (Git LFS)
|
assets/class-level/car-1.gif
ADDED (Git LFS)
|
assets/class-level/husky.gif
ADDED (Git LFS)
|
assets/class-level/pig.gif
ADDED (Git LFS)
|
assets/class-level/posche.gif
ADDED (Git LFS)
|
assets/class-level/tennis.gif
ADDED (Git LFS)
|
assets/class-level/tennis_1cls.gif
ADDED (Git LFS)
|
assets/class-level/tennis_3cls.gif
ADDED (Git LFS)
|
assets/class-level/tiger.gif
ADDED (Git LFS)
|
assets/class-level/wolf.gif
ADDED (Git LFS)
|
assets/{bear_weight.gif → vis/bear_weight.gif}
RENAMED
|
File without changes
|
config/part_level/adding_new_object/run_two_man/{running_spider_polar_sunglass.yaml → spider_polar_sunglass.yaml}
RENAMED
|
File without changes
|
test.sh
CHANGED
|
@@ -1,2 +1,2 @@
|
|
| 1 |
export CUDA_VISIBLE_DEVICES=0
|
| 2 |
-
accelerate launch test.py --config config/part_level/adding_new_object/run_two_man/
|
|
|
|
| 1 |
export CUDA_VISIBLE_DEVICES=0
|
| 2 |
+
accelerate launch test.py --config config/part_level/adding_new_object/run_two_man/spider_polar_sunglass.yaml
|