DeepSeek R1 (All Versions) Collection DeepSeek R1 - the most powerful reasoning open-source model - available in GGUF, original & 4-bit formats. Includes Llama & Qwen distilled models. • 29 items • Updated 7 days ago • 178
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 2 items • Updated 19 days ago • 99
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 19 days ago • 342
SmolVLM 256M & 500M Collection Collection for models & demos for even smoller SmolVLM release • 12 items • Updated 23 days ago • 68
jina-embeddings-v3 Collection Multilingual multi-task general text embedding model • 6 items • Updated Sep 19, 2024 • 20
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining Paper • 2501.00958 • Published Jan 1 • 99
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published Dec 25, 2024 • 97
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference Paper • 2412.13663 • Published Dec 18, 2024 • 131
ModernBERT Collection Bringing BERT into modernity via both architecture changes and scaling • 3 items • Updated Dec 19, 2024 • 134
Sana Collection ⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 21 items • Updated 4 days ago • 88