Gemma 3 seems to be really good at human preference. Just waiting for ppl to see it.
A Multimodal Symphony: Integrating Taste and Sound through Generative AI Paper • 2503.02823 • Published 9 days ago • 2
DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion Paper • 2503.01183 • Published 11 days ago • 26
LiteASR: Efficient Automatic Speech Recognition with Low-Rank Approximation Paper • 2502.20583 • Published 14 days ago • 11
Wan2.1 🔥📹 new OPEN video model by Alibaba Wan team!
Model: Wan-AI/Wan2.1-T2V-14B
Demo: Wan-AI/Wan2.1
✨ Apache 2.0
✨ 8.19GB VRAM, runs on most GPUs
✨ Multi-Tasking: T2V, I2V, Video Editing, T2I, V2A
✨ Text Generation: Supports Chinese & English
✨ Powerful Video VAE: Encode/decode 1080P w/ temporal precision
She arrived 😍 [Expect more models soon...]
SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation Paper • 2502.13128 • Published 23 days ago • 37