ComfyUI-TangoFlux

ComfyUI Custom Nodes for "TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching". These nodes, adapted from the official implementations, generate high-quality 44.1kHz audio up to 30 seconds long from just a text prompt.

Installation

  1. Navigate to your ComfyUI's custom_nodes directory:
cd ComfyUI/custom_nodes
  2. Clone this repository:
git clone https://github.com/declare-lab/TangoFlux ComfyUI-TangoFlux
  3. Install requirements:
cd ComfyUI-TangoFlux/comfyui
python install.py

Or Install via ComfyUI Manager

Check out some demos from the official demo page

Example Workflow

example_workflow

Usage

All the necessary models should be automatically downloaded when the TangoFluxLoader node is used for the first time.

Models can also be downloaded using the install.py script.

models_folder_structure

Manual Download:

  • Download TangoFlux from here into models/tangoflux
  • Download text encoders from here into models/text_encoders/google-flan-t5-large

(Include everything as shown in the screenshot above. Do not rename anything.)
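As a quick sanity check after a manual download, the folder layout described above can be verified with a short script. This is a minimal sketch, not part of the repository: the helper name `missing_model_dirs` is hypothetical, and it assumes the `models/` paths are relative to the ComfyUI root directory. It only checks that each folder exists and is non-empty, not that every required file is present.

```python
from pathlib import Path

# Relative paths taken from the manual-download instructions.
# Assumption: they are resolved against the ComfyUI root directory.
EXPECTED_MODEL_DIRS = (
    Path("models") / "tangoflux",
    Path("models") / "text_encoders" / "google-flan-t5-large",
)


def missing_model_dirs(comfyui_root):
    """Return the expected model folders that are absent or empty."""
    root = Path(comfyui_root)
    missing = []
    for rel in EXPECTED_MODEL_DIRS:
        folder = root / rel
        # A folder counts as missing if it does not exist or holds no entries.
        if not folder.is_dir() or not any(folder.iterdir()):
            missing.append(rel)
    return missing


if __name__ == "__main__":
    # "ComfyUI" is an assumed install location; adjust it to your setup.
    for rel in missing_model_dirs("ComfyUI"):
        print(f"missing or empty: {rel}")
```

If the script prints nothing, both folders exist and contain files; the file names themselves should still match the screenshot.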

The nodes can be found in the "TangoFlux" category as TangoFluxLoader, TangoFluxSampler, and TangoFluxVAEDecodeAndPlay.

teacache_options

TeaCache can speed up TangoFlux by about 2x without much audio quality degradation, in a training-free manner.

📈 Inference Latency Comparisons on a Single A800

TangoFlux | TeaCache (0.25) | TeaCache (0.4)
~4.08 s   | ~2.42 s         | ~1.95 s
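To make the speedup explicit, the latencies in the table above can be converted into factors relative to the TangoFlux baseline. This is a small illustrative calculation; the dictionary and the `speedups` helper are hypothetical names, and the numbers are the approximate values copied from the table.

```python
# Approximate latencies in seconds, copied from the table (single A800).
LATENCY_S = {
    "TangoFlux": 4.08,
    "TeaCache (0.25)": 2.42,
    "TeaCache (0.4)": 1.95,
}


def speedups(latencies, baseline="TangoFlux"):
    """Return baseline latency divided by each entry's latency."""
    base = latencies[baseline]
    return {name: base / seconds for name, seconds in latencies.items()}


if __name__ == "__main__":
    for name, factor in speedups(LATENCY_S).items():
        print(f"{name}: {factor:.2f}x")
```

With these figures, the 0.25 setting works out to roughly 1.7x and the 0.4 setting to roughly 2.1x over the baseline.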

Citation

@misc{hung2024tangofluxsuperfastfaithful,
      title={TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization}, 
      author={Chia-Yu Hung and Navonil Majumder and Zhifeng Kong and Ambuj Mehrish and Rafael Valle and Bryan Catanzaro and Soujanya Poria},
      year={2024},
      eprint={2412.21037},
      archivePrefix={arXiv},
      primaryClass={cs.SD},
      url={https://arxiv.org/abs/2412.21037}, 
}
@article{liu2024timestep,
  title={Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model},
  author={Liu, Feng and Zhang, Shiwei and Wang, Xiaofeng and Wei, Yujie and Qiu, Haonan and Zhao, Yuzhong and Zhang, Yingya and Ye, Qixiang and Wan, Fang},
  journal={arXiv preprint arXiv:2411.19108},
  year={2024}
}