Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • 20 days ago • 62
Kimi k1.5: Scaling Reinforcement Learning with LLMs Paper • 2501.12599 • Published 21 days ago • 91
FuseChat 3.0 Collection Preference Optimization for Implicit Model Fusion • 13 items • Updated 5 days ago • 11
Article FuseChat-3.0: Preference Optimization for Implicit Model Fusion By Wanfq and 2 others • Dec 18, 2024 • 5
Article SmolVLM Grows Smaller – Introducing the 250M & 500M Models! 20 days ago • 124
SmolVLM 256M & 500M Collection Collection for models & demos for even smoller SmolVLM release • 12 items • Updated 20 days ago • 68
Article FuseO1-Preview: System-II Reasoning Fusion of LLMs By Wanfq and 4 others • 22 days ago • 13
FuseO1-Preview Collection System-II Reasoning Fusion of LLMs • 10 items • Updated 12 days ago • 17
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated 6 days ago • 228
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 91
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 20 days ago • 315
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning Paper • 2501.12570 • Published 21 days ago • 23
Article The SOTA Text-to-speech and Zero Shot Voice cloning model that no one knows about... By srinivasbilla • 22 days ago • 60
DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 4 days ago • 168