Submitted by akhaliq 63 Understanding LLMs: A Comprehensive Overview from Training to Inference · 21 authors 2
Submitted by akhaliq 31 Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation · 3 authors 3
Submitted by akhaliq 31 Instruct-Imagen: Image Generation with Multi-modal Instruction · 12 authors 3
Submitted by akhaliq 16 LLaVA-φ: Efficient Multi-Modal Assistant with Small Language Model · 6 authors 4
Submitted by akhaliq 14 What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs · 8 authors 1
Submitted by akhaliq 11 ICE-GRT: Instruction Context Enhancement by Generative Reinforcement based Transformers · 7 authors 1
Submitted by akhaliq 8 Improving Diffusion-Based Image Synthesis with Context Prediction · 8 authors 1
Submitted by akhaliq 8 FMGS: Foundation Model Embedded 3D Gaussian Splatting for Holistic 3D Scene Understanding · 5 authors 1
Submitted by akhaliq 7 Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers · 3 authors 1