Submitted by akhaliq · 74 · Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild · 9 authors · 15
Submitted by akhaliq · 30 · WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models · 8 authors · 4
Submitted by akhaliq · 13 · BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models · 4 authors · 1
Submitted by akhaliq · 12 · SpacTor-T5: Pre-training T5 Models with Span Corruption and Replaced Token Detection · 9 authors · 2
Submitted by akhaliq · 11 · ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models · 4 authors · 1
Submitted by akhaliq · 11 · UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion · 4 authors · 3
Submitted by akhaliq · 10 · CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion · 8 authors · 1