|
|
--- |
|
|
title: Video Model Studio |
|
|
emoji: 🎥 |
|
|
colorFrom: gray |
|
|
colorTo: gray |
|
|
sdk: gradio |
|
|
sdk_version: 5.15.0 |
|
|
app_file: app.py |
|
|
pinned: true |
|
|
license: apache-2.0 |
|
|
short_description: All-in-one tool for AI video training |
|
|
--- |
|
|
|
|
|
# 🎥 Video Model Studio (VMS) |
|
|
|
|
|
## Presentation |
|
|
|
|
|
### What is this project? |
|
|
|
|
|
VMS is a Gradio app that wraps Finetrainers to provide a simple UI for training AI video models on Hugging Face. |
|
|
|
|
|
You can deploy it to a private space, and start long-running training jobs in the background. |
|
|
|
|
|
### One-user-per-space design |
|
|
|
|
|
Currently VMS only supports one training job at a time, and anybody with access to your Gradio app will be able to upload or delete everything. |
|
|
|
|
|
This means you have to run VMS in a *PRIVATE* HF Space, or locally if you require full privacy. |
|
|
|
|
|
### Similar projects |
|
|
|
|
|
I wasn't aware of its existence when I started my project, but there is also this open-source initiative: https://github.com/alisson-anjos/diffusion-pipe-ui |
|
|
|
|
|
## Features |
|
|
|
|
|
### Run Finetrainers in the background |
|
|
|
|
|
The main feature of VMS is the ability to run a Finetrainers training session in the background. |
|
|
|
|
|
You can start your job, close the web browser tab, and come back the next morning to see the result. |
|
|
|
|
|
### Automatic scene splitting |
|
|
|
|
|
VMS uses PySceneDetect to automatically split your videos into individual scenes. |
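
VMS drives this step automatically, but if you want to reproduce it by hand, PySceneDetect also ships a CLI. A minimal sketch (the exact detector and threshold VMS uses may differ):

```bash
# rough equivalent of the splitting step; split-video requires ffmpeg to be installed
scenedetect -i my_video.mp4 -o clips/ detect-content split-video
```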
|
|
|
|
|
### Automatic clip captioning |
|
|
|
|
|
VMS uses `LLaVA-Video-7B-Qwen2` for captioning. You can customize the system prompt if you want to. |
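
For example, a customized system prompt could look like this (purely illustrative, this is not the default prompt used by VMS):

```
Describe this video clip in one detailed paragraph. Mention the subject,
the action, the camera movement, the lighting and the overall mood.
```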
|
|
|
|
|
### Download your dataset |
|
|
|
|
|
Not interested in using VMS for training? That's perfectly fine! |
|
|
|
|
|
You can use VMS for video splitting and captioning, and export the data for training on another platform, e.g. Replicate or Fal. |
|
|
|
|
|
## Supported models |
|
|
|
|
|
VMS uses `Finetrainers` under the hood. In theory any model supported by Finetrainers should work in VMS. |
|
|
|
|
|
In practice, a PR (pull request) will be necessary to adapt the UI a bit to accommodate each model's specificities. |
|
|
|
|
|
### LTX-Video |
|
|
|
|
|
I have tested training a LoRA model on videos, using a single A100 instance. |
|
|
|
|
|
### HunyuanVideo |
|
|
|
|
|
I haven't tested it yet, but in theory it should work out of the box. |
|
|
Please keep in mind that this requires a lot of processing power. |
|
|
|
|
|
### CogVideoX |
|
|
|
|
|
Do you want support for this one? Let me know in the comments! |
|
|
|
|
|
## Deployment |
|
|
|
|
|
VMS is built on top of Finetrainers and Gradio, and designed to run as a Hugging Face Space (but you can deploy it anywhere that has an NVIDIA GPU and supports Docker). |
|
|
|
|
|
### Full installation at Hugging Face |
|
|
|
|
|
Easy peasy: create a Space (make sure to use the `Gradio` type/template), and push the repo. No Docker needed! |
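
If you prefer the command line, the flow looks roughly like this (the Space name is a placeholder, you must be logged in with `huggingface-cli login`, and the exact `repo create` syntax may vary with your `huggingface_hub` version):

```bash
# hypothetical Space name; --space_sdk selects the Gradio template
huggingface-cli repo create video-model-studio --type space --space_sdk gradio

# then push this repo to the Space
git remote add space https://huggingface.co/spaces/<your-username>/video-model-studio
git push space main
```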
|
|
|
|
|
That said, please see the "Run" section for info about environment variables. |
|
|
|
|
|
### Dev mode on Hugging Face |
|
|
|
|
|
Enable dev mode in the Space, then open VS Code (locally or remotely) and run: |
|
|
|
|
|
```bash |
|
|
pip install -r requirements.txt |
|
|
``` |
|
|
|
|
|
As this is not automatic, you then need to click on "Restart" in the Space's dev mode UI widget. |
|
|
|
|
|
### Full installation somewhere else |
|
|
|
|
|
I haven't tested it, but you can try the provided Dockerfile. |
|
|
|
|
|
### Full installation in local |
|
|
|
|
|
The full installation requires: |
|
|
- Linux |
|
|
- CUDA 12 |
|
|
- Python 3.10 |
|
|
|
|
|
This is because of flash attention, which is declared in `requirements.txt` using a URL to download a prebuilt wheel (Python bindings for a native library). |
|
|
|
|
|
```bash |
|
|
./setup.sh |
|
|
``` |
|
|
|
|
|
### Degraded installation in local |
|
|
|
|
|
If you cannot meet the requirements, you can: |
|
|
|
|
|
- solution 1: fix `requirements.txt` to use another prebuilt wheel (see the sketch after this list) |
|
|
- solution 2: manually build/install flash attention |
|
|
- solution 3: don't use clip captioning |
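
Here is a sketch of solution 1 (the URL is a placeholder; substitute a real wheel matching your Python, CUDA and torch versions):

```bash
# placeholder URL: pick an actual wheel, e.g. one listed at
# https://github.com/Dao-AILab/flash-attention/releases
pip install "https://github.com/Dao-AILab/flash-attention/releases/download/<version>/<matching-wheel>.whl"
```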
|
|
|
|
|
Here is how to do solution 3: |
|
|
```bash |
|
|
./setup_no_captions.sh |
|
|
``` |
|
|
|
|
|
## Run |
|
|
|
|
|
### Running the Gradio app |
|
|
|
|
|
Note: please make sure you properly define the environment variables `STORAGE_PATH` (e.g. `/data/`) and `HF_HOME` (e.g. `/data/huggingface/`). |
|
|
|
|
|
```bash |
|
|
python app.py |
|
|
``` |
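
For example, with the variables set inline (reusing the example paths from the note above):

```bash
STORAGE_PATH=/data/ HF_HOME=/data/huggingface/ python app.py
```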
|
|
|
|
|
### Running locally |
|
|
|
|
|
See the remarks above about the environment variables. |
|
|
|
|
|
By default `run.sh` will store stuff in `.data/` (located inside the current working directory): |
|
|
|
|
|
```bash |
|
|
./run.sh |
|
|
``` |
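
If you want the files to go somewhere else, you can try overriding the storage path (this assumes `run.sh` passes `STORAGE_PATH` through to the app, which you should verify):

```bash
# assumption: run.sh honors the same STORAGE_PATH variable as app.py
STORAGE_PATH=/path/of/your/choice ./run.sh
```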