F5-TTS
F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
Generate text from an image and question
Generate images based on input image and prompt
Generate custom images using LoRA models
More advanced and challenging multi-task evaluation
Analyze an image to generate tags and ratings
Annotate and describe images with text prompts
Interact with Florence-2 to analyze images and generate descriptions
Analyze images to generate captions, detect objects, or perform OCR
Chat with an AI that understands text and images
A tiny vision language model
Transcribe audio with emotions and events