Submitted by akhaliq · LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model · 5 authors
Submitted by akhaliq · CameraCtrl: Enabling Camera Control for Text-to-Video Generation · 7 authors
Submitted by akhaliq · Bigger is not Always Better: Scaling Properties of Latent Diffusion Models · 6 authors
Submitted by akhaliq · LLM-ABR: Designing Adaptive Bitrate Algorithms via Large Language Models · 7 authors