Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM 2 days ago • 199
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning Paper • 2502.17407 • Published 17 days ago • 24
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published 21 days ago • 97
Multilingual Machine Translation with Open Large Language Models at Practical Scale: An Empirical Study Paper • 2502.02481 • Published Feb 4 • 10
Article Improving Hugging Face Training Efficiency Through Packing with Flash Attention Aug 21, 2024 • 32