Multimodal Language Model Collection What matters besides the data recipe when training a multimodal language model? • 30 items • Updated 1 day ago • 1
Ola: Pushing the Frontiers of Omni-Modal Language Model with Progressive Modality Alignment Paper • 2502.04328 • Published 8 days ago • 21
Image / Video Gen Collection Image Generation Using Diffusion-Based Methods: Tips and Techniques for Stable Diffusion • 35 items • Updated 2 days ago • 7
Magic 1-For-1: Generating One Minute Video Clips within One Minute Paper • 2502.07701 • Published 3 days ago • 24
Open Datasets Collection Thank you for sharing your datasets. I’ve fed them to my model, and they have benefited it. • 17 items • Updated 4 days ago
VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models Paper • 2502.02492 • Published 10 days ago • 51
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 23 days ago • 318
VideoWorld: Exploring Knowledge Learning from Unlabeled Videos Paper • 2501.09781 • Published 29 days ago • 25
Do generative video models learn physical principles from watching videos? Paper • 2501.09038 • Published about 1 month ago • 32
Textoon: Generating Vivid 2D Cartoon Characters from Text Descriptions Paper • 2501.10020 • Published 28 days ago • 22