MMLU-Pro Leaderboard 🥇 More advanced and challenging multi-task evaluation
Steel-LLM: From Scratch to Open Source -- A Personal Journey in Building a Chinese-Centric LLM Paper • 2502.06635 • Published 1 day ago • 4
Generating Symbolic World Models via Test-time Scaling of Large Language Models Paper • 2502.04728 • Published 5 days ago • 15