---
tags:
- ltx-video
- text-to-video
- image-to-video
pinned: true
language:
- en
license: other
---
This is a special fork of LTX-Video designed to facilitate deployment to Hugging Face Inference Endpoints, using diffusers + varnish.
If you want to know how to use it once you've deployed it, please refer to this Python demo:
https://huggingface.co/jbilcke-hf/LTX-Video-for-InferenceEndpoints/blob/main/example.py
## Why?
Using Hugging Face Inference Endpoints gives you reliable and controllable availability.

The goal of this wrapper is to offer flexibility and low-level settings, adding things such as upscaling, interpolation, film grain, and compression controls.
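For illustration, here is a minimal sketch of what calling a deployed endpoint could look like. The payload layout and the post-processing parameter names (`upscaling`, `interpolation`, `film_grain`, `compression`) are assumptions made for this sketch; refer to `example.py` linked above for the actual request format.

```python
# Minimal sketch of calling a deployed endpoint (hypothetical payload;
# see example.py in this repository for the real parameter names).
import os
import requests

ENDPOINT_URL = "https://<your-endpoint>.endpoints.huggingface.cloud"  # replace with your endpoint URL
HF_API_TOKEN = os.environ["HF_API_TOKEN"]

payload = {
    "inputs": "A low-angle shot of a lighthouse during a storm",
    "parameters": {
        # Hypothetical post-processing controls exposed by the wrapper:
        "upscaling": True,
        "interpolation": True,
        "film_grain": 0.2,
        "compression": "h264",
    },
}

response = requests.post(
    ENDPOINT_URL,
    headers={"Authorization": f"Bearer {HF_API_TOKEN}"},
    json=payload,
    timeout=300,
)
response.raise_for_status()
print(response.json())
```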
## Setup on Hugging Face Inference Endpoints
### Pick a large machine

It is recommended to use at least an NVIDIA L40S with 48 GB of VRAM.
### Download all assets

Make sure to select "Download everything" when selecting the model. Otherwise, some files ending in `.pth` won't be downloaded.
### Select Text-to-Video or Image-to-Video

By default, the handler does Text-to-Video.

To do Image-to-Video (an input image plus a text prompt), you need to set the environment variable `SUPPORT_INPUT_IMAGE_PROMPT` to a trueish value (e.g. `1`, `True`).

It is possible to support both pipelines at the same time if you modify `handler.py`, but keeping both pipelines active in parallel will consume more memory.
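As a rough illustration of what a "trueish" value means here, the handler can be expected to parse the variable along the lines of the sketch below; the helper is hypothetical, and the actual logic lives in `handler.py`.

```python
# Hypothetical sketch of parsing a "trueish" environment variable;
# check handler.py for the actual implementation.
import os

def env_flag(name: str, default: bool = False) -> bool:
    """Return True if the environment variable is set to a trueish value."""
    value = os.environ.get(name)
    if value is None:
        return default
    return value.strip().lower() in {"1", "true", "yes", "on"}

if env_flag("SUPPORT_INPUT_IMAGE_PROMPT"):
    print("Image-to-Video pipeline enabled")
else:
    print("Text-to-Video pipeline enabled")
```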
## Using private LTX-Video LoRAs

If you plan on using private LoRA models, you will have to set the `HF_API_TOKEN` environment variable.
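For reference, here is a minimal sketch of how such a token can be used to load a private LoRA with diffusers; the base model, the LoRA repository id, and the loading flow are assumptions made for this sketch, and the handler may do this differently (see `handler.py`).

```python
# Hypothetical sketch of loading a private LoRA using HF_API_TOKEN;
# the LoRA repository id below is a placeholder.
import os
import torch
from diffusers import LTXPipeline

token = os.environ["HF_API_TOKEN"]

pipe = LTXPipeline.from_pretrained(
    "Lightricks/LTX-Video",
    torch_dtype=torch.bfloat16,
).to("cuda")

# Download and apply LoRA weights from a private repository,
# authenticating with the token.
pipe.load_lora_weights(
    "your-username/your-private-ltx-lora",  # placeholder repo id
    token=token,
)
```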
## Credits
For more information about this model, please see the original HF repository here.