Forgetting Transformer: Softmax Attention with a Forget Gate Paper • 2503.02130 • Published 10 days ago • 26 • 4
R1-Zero's "Aha Moment" in Visual Reasoning on a 2B Non-SFT Model Paper • 2503.05132 • Published 7 days ago • 47
CognitiveDrone: A VLA Model and Evaluation Benchmark for Real-Time Cognitive Task Solving and Reasoning in UAVs Paper • 2503.01378 • Published 11 days ago • 3
GEN3C: 3D-Informed World-Consistent Video Generation with Precise Camera Control Paper • 2503.03751 • Published 8 days ago • 19
RectifiedHR: Enable Efficient High-Resolution Image Generation via Energy Rectification Paper • 2503.02537 • Published 10 days ago • 11 • 3
DiffRhythm: Blazingly Fast and Embarrassingly Simple End-to-End Full-Length Song Generation with Latent Diffusion Paper • 2503.01183 • Published 11 days ago • 26
Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models Paper • 2503.01774 • Published 11 days ago • 39
Mobius: Text to Seamless Looping Video Generation via Latent Shift Paper • 2502.20307 • Published 15 days ago • 17
Guardians of the Agentic System: Preventing Many Shots Jailbreak with Agentic System Paper • 2502.16750 • Published 18 days ago • 10
Multimodal Representation Alignment for Image Generation: Text-Image Interleaved Control Is Easier Than You Think Paper • 2502.20172 • Published 15 days ago • 27
UniTok: A Unified Tokenizer for Visual Generation and Understanding Paper • 2502.20321 • Published 15 days ago • 29
KV-Edit: Training-Free Image Editing for Precise Background Preservation Paper • 2502.17363 • Published 18 days ago • 33
Language Models' Factuality Depends on the Language of Inquiry Paper • 2502.17955 • Published 17 days ago • 30
SIFT: Grounding LLM Reasoning in Contexts via Stickers Paper • 2502.14922 • Published 23 days ago • 30