---
tags:
- ltx-video
- text-to-video
- image-to-video
pinned: true
language:
- en
license: other
---
This is a special fork of LTX-Video designed to facilitate deployment to Hugging Face Inference Endpoints, using diffusers + varnish.
If you want to know how to use it once you've deployed it, please refer to this Python demo:
https://huggingface.co/jbilcke-hf/LTX-Video-for-InferenceEndpoints/blob/main/example.py
## Why?
Using Hugging Face Inference Endpoints gives you reliable and controllable availability.

The goal of this wrapper is to offer flexibility and low-level settings, adding things such as upscaling, interpolation, film grain, and compression controls.
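For illustration, here is a minimal sketch of what calling a deployed endpoint could look like. The payload layout and the post-processing parameter names (`upscaling`, `interpolation`, `film_grain`, `compression`) are assumptions made for this sketch; refer to `example.py` linked above for the actual request format.

```python
# Minimal sketch of calling a deployed endpoint (hypothetical payload;
# see example.py in this repository for the real parameter names).
import os
import requests

ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"  # replace with your endpoint URL
HF_API_TOKEN = os.environ["HF_API_TOKEN"]

payload = {
    "inputs": "A low-angle shot of a lighthouse during a storm",
    "parameters": {
        # Hypothetical post-processing controls exposed by the wrapper:
        "upscaling": True,
        "interpolation": True,
        "film_grain": 0.2,
        "compression": "h264",
    },
}

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {HF_API_TOKEN}"},
    json=payload,
    timeout=300,
)
response.raise_for_status()
print(response.json())
```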
## Setup on Hugging Face Inference Endpoints
### Pick a large machine

It is recommended to use at least an NVIDIA L40S with 48 GB of VRAM.
### Download all assets

Make sure to select "Download everything" when selecting the model. Otherwise, some files ending in `.pth` won't be downloaded.
### Select Text-to-Video or Image-to-Video

By default, the handler does Text-to-Video.

To do Image-to-Video (an input image plus a text prompt), you need to set the environment variable `SUPPORT_INPUT_IMAGE_PROMPT` to a trueish value (e.g. `1`, `True`).

It is possible to support both pipelines at the same time if you modify `handler.py`, but keeping both pipelines active in parallel will consume more memory.
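As a rough illustration of what a "trueish" value means here, the handler can be expected to parse the variable along the lines of the sketch below; the helper is hypothetical, and the actual logic lives in `handler.py`.

```python
# Hypothetical sketch of parsing a "trueish" environment variable;
# check handler.py for the actual implementation.
import os

def env_flag(name: str, default: bool = False) -> bool:
    """Return True if the environment variable is set to a trueish value."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}

if env_flag("SUPPORT_INPUT_IMAGE_PROMPT"):
    print("Image-to-Video pipeline enabled")
else:
    print("Text-to-Video pipeline enabled")
```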
## Using private LTX-Video LoRAs

If you plan on using private LoRA models, you will have to set the `HF_API_TOKEN` environment variable.
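For reference, here is a minimal sketch of how such a token can be used to load a private LoRA with diffusers; the base model, the LoRA repository id, and the loading flow are assumptions made for this sketch, and the handler may do this differently (see `handler.py`).

```python
# Hypothetical sketch of loading a private LoRA using HF_API_TOKEN;
# the LoRA repository id below is a placeholder.
import os
import torch
from diffusers import LTXPipeline

token = os.environ["HF_API_TOKEN"]

pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Download and apply LoRA weights from a private repository,
# authenticating with the token.
pipe.load_lora_weights(
    "your-username/your-private-ltx-lora",  # placeholder repo id
    token=token,
)
```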
## Credits
For more information about this model, please see the original HF repository here.