Gemstones: A Model Suite for Multi-Faceted Scaling Laws • Paper • 2502.06857 • Published 12 days ago • 22
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection • Paper • 2403.03507 • Published Mar 6, 2024 • 185 • 15
From Pixels to Prose: A Large Dataset of Dense Image Captions • Paper • 2406.10328 • Published Jun 14, 2024 • 18