Generate a short video from an image
Convert spoken words into text
Generate text responses from prompts
Engage in multimedia chat with LLMs and other ML models