CoT-Valve: Length-Compressible Chain-of-Thought Tuning Paper • 2502.09601 • Published about 19 hours ago • 8
SelfCite: Self-Supervised Alignment for Context Attribution in Large Language Models Paper • 2502.09604 • Published about 19 hours ago • 19
Skrr: Skip and Re-use Text Encoder Layers for Memory Efficient Text-to-Image Generation Paper • 2502.08690 • Published 2 days ago • 27
InfiniteHiP: Extending Language Model Context Up to 3 Million Tokens on a Single GPU Paper • 2502.08910 • Published 1 day ago • 50
Next Block Prediction: Video Generation via Semi-Autoregressive Modeling Paper • 2502.07737 • Published 3 days ago • 7
DPO-Shift: Shifting the Distribution of Direct Preference Optimization Paper • 2502.07599 • Published 3 days ago • 11
LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid Paper • 2502.07563 • Published 3 days ago • 20
Light-A-Video: Training-free Video Relighting via Progressive Light Fusion Paper • 2502.08590 • Published 2 days ago • 34
Fino1: On the Transferability of Reasoning Enhanced LLMs to Finance Paper • 2502.08127 • Published 2 days ago • 42
Hephaestus: Improving Fundamental Agent Capabilities of Large Language Models through Continual Pre-Training Paper • 2502.06589 • Published 4 days ago • 14
Sparse Autoencoders for Scientifically Rigorous Interpretation of Vision Models Paper • 2502.06755 • Published 4 days ago • 5
CoS: Chain-of-Shot Prompting for Long Video Understanding Paper • 2502.06428 • Published 4 days ago • 8
Teaching Language Models to Critique via Reinforcement Learning Paper • 2502.03492 • Published 10 days ago • 21
LLMs Can Easily Learn to Reason from Demonstrations Structure, not content, is what matters! Paper • 2502.07374 • Published 3 days ago • 27
Retrieval-augmented Large Language Models for Financial Time Series Forecasting Paper • 2502.05878 • Published 5 days ago • 32
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction Paper • 2502.07316 • Published 3 days ago • 28