Kanana: Compute-efficient Bilingual Language Models Paper • 2502.18934 • Published 15 days ago • 62
Phi-4 Collection Phi-4 family of small language and multi-modal models. • 7 items • Updated 10 days ago • 109
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling Paper • 2502.06703 • Published about 1 month ago • 142
Post We're excited to share major improvements to our Janus-Pro-7B Text-to-Image Generation Space! What's New: 1. Critical Bug Fixes 2. Enhanced Features 3. UI Improvements 4. Performance Boost. Try It Now: Bils/DeepseekJanusPro-Image
Article Mastering Long Contexts in LLMs with KVPress By nvidia and 1 other • Jan 23 • 64
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 8 items • Updated 17 days ago • 396
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated 15 days ago • 106
KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models Paper • 2412.06071 • Published Dec 8, 2024 • 9