Update README.md
README.md CHANGED
@@ -24,7 +24,7 @@ library_name: diffusers
 <a href=https://huggingface.co/collections/ByteDance/video-as-prompt target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Models-d96902.svg height=22px></a>
 <a href=https://huggingface.co/datasets/BianYx/VAP-Data target="_blank"><img src=https://img.shields.io/badge/%F0%9F%A4%97%20Dataset-276cb4.svg height=22px></a>
 <a href=https://github.com/bytedance/Video-As-Prompt target="_blank"><img src=https://img.shields.io/badge/Code-black.svg?logo=github height=22px></a>
-<a href=https://
+<a href=https://arxiv.org/pdf/2510.20888 target="_blank"><img src=https://img.shields.io/badge/Arxiv-b5212f.svg?logo=arxiv height=22px></a>
 <!-- <a href=https://yxbian23.github.io/ target="_blank"><img src=https://img.shields.io/badge/Twitter-grey.svg?logo=x height=22px></a> -->
 <!-- <a href="https://opensource.org/licenses/Apache">
 <img src="https://img.shields.io/badge/License-Apache%202.0-lightgray">
@@ -40,7 +40,7 @@ library_name: diffusers
 
 - Oct 24, 2025: 🎉 We release the first unified semantic video generation model, [Video-As-Prompt (VAP)](https://github.com/bytedance/Video-As-Prompt)!
 - Oct 24, 2025: 🤗 We release [VAP-Data](https://huggingface.co/datasets/BianYx/VAP-Data), the largest semantic-controlled video generation dataset, with more than $100K$ samples!
-- Oct 24, 2025: 📖 We present the [technical report](https://
+- Oct 24, 2025: 📖 We present the [technical report](https://arxiv.org/pdf/2510.20888) of Video-As-Prompt; please check out the details and spark some discussion!
 
 
 
@@ -243,26 +243,21 @@ bash examples/training/sft/cogvideox/vap_mot/train_multi_node.sh xxx:xxx:xxx:xxx
 * All scripts read a shared config (datasets, output dir, batch size, etc.); edit the script to override it.
 * Please edit `train_multi_node*.sh` based on your environment if you want to change the distributed settings (e.g., gpu num, node num, master addr/port, etc.).
 
-<!--
 ## 📜 BibTeX
 
-If you found this repository helpful, please cite our report:
+❤️ If you found this repository helpful, please give us a star and cite our report:
 
 ```bibtex
-
-
-
+@article{bian2025videoasprompt,
+  title   = {Video-As-Prompt: Unified Semantic Control for Video Generation},
+  author  = {Yuxuan Bian and Xin Chen and Zenan Li and Tiancheng Zhi and Shen Sang and Linjie Luo and Qiang Xu},
+  journal = {arXiv preprint arXiv:2510.20888},
+  year    = {2025},
+  url     = {https://arxiv.org/abs/2510.20888}
+}
+```
 ## Acknowledgements
 
 We would like to thank the contributors to the [Finetrainers](https://github.com/huggingface/finetrainers), [Diffusers](https://github.com/huggingface/diffusers), [CogVideoX](https://github.com/zai-org/CogVideo), and [Wan](https://github.com/Wan-Video/Wan2.1) repositories for their open research and exploration.
 
 
-<!-- ## Star History
-
-<a href="https://star-history.com/#bytedance/Video-As-Prompt&Date">
-<picture>
-<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=bytedance/Video-As-Prompt&type=Date&theme=dark" />
-<source media="(prefers-color-scheme: light)" srcset="https://api.star-history.com/svg?repos=bytedance/Video-As-Prompt&type=Date" />
-<img alt="Star History Chart" src="https://api.star-history.com/svg?repos=bytedance/Video-As-Prompt&type=Date" />
-</picture>
-</a> -->
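The second bullet in the hunk above points at the distributed settings in `train_multi_node*.sh`. For orientation, here is a minimal sketch of the kind of values such a script exposes, assuming a torchrun-style launcher; every variable name and the `train.py` entry point below are illustrative placeholders, not the repo's actual script:

```bash
#!/usr/bin/env bash
# Hypothetical sketch, not the repo's train_multi_node.sh: it only shows the
# distributed knobs (gpu num, node num, master addr/port) one typically edits.

NNODES=2                 # node num: total number of machines
GPUS_PER_NODE=8          # gpu num: processes launched per machine
NODE_RANK=${1:?usage: $0 <node_rank> <master_addr>}    # 0 on the master node
MASTER_ADDR=${2:?usage: $0 <node_rank> <master_addr>}  # IP of the rank-0 node
MASTER_PORT=29500        # any free port, identical across all nodes

torchrun \
  --nnodes "$NNODES" \
  --nproc_per_node "$GPUS_PER_NODE" \
  --node_rank "$NODE_RANK" \
  --master_addr "$MASTER_ADDR" \
  --master_port "$MASTER_PORT" \
  train.py  # placeholder for the actual training entry point
```

The same script runs on every node, each passing its own rank plus the shared master address, e.g. rank 0 on the master machine and rank 1 on the second.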