Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2412.10302

Deepseek Papers

Deepseek papers collection

DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior

Paper • 2310.16818 • Published Oct 25, 2023 • 32
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Paper • 2401.02954 • Published Jan 5, 2024 • 45
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Paper • 2401.06066 • Published Jan 11, 2024 • 51
DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Paper • 2401.14196 • Published Jan 25, 2024 • 60

DeepSeek-V3 Technical Report

Paper • 2412.19437 • Published Dec 27, 2024 • 55
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Paper • 2412.10302 • Published Dec 13, 2024 • 17

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Paper • 2412.10302 • Published Dec 13, 2024 • 17

deepseek papers

DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence

Paper • 2401.14196 • Published Jan 25, 2024 • 60
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Paper • 2401.02954 • Published Jan 5, 2024 • 45
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 107
DeepSeek-VL: Towards Real-World Vision-Language Understanding

Paper • 2403.05525 • Published Mar 8, 2024 • 44

DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data

Paper • 2405.14333 • Published May 23, 2024 • 40
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search

Paper • 2408.08152 • Published Aug 15, 2024 • 56
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 346
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

Paper • 2402.03300 • Published Feb 5, 2024 • 107

Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9 • 53
s1: Simple test-time scaling

Paper • 2501.19393 • Published Jan 31 • 111
Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling

Paper • 2502.06703 • Published Feb 10 • 142
Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models

Paper • 2501.12370 • Published Jan 21 • 11

Muff collection

Kijai/HunyuanVideo_comfy

Updated 3 days ago • 9.25k • 351
IamCreateAI/Ruyi-Mini-7B

Image-to-Video • Updated Dec 25, 2024 • 2.29k • 604
Running on Zero

34

34

Falcon3 Demo

🦅

F3-DEMO
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Paper • 2412.10302 • Published Dec 13, 2024 • 17

Running on Zero

428

428

Chat with DeepSeek-VL2-small

🌍

Generate responses using images and text input
deepseek-ai/deepseek-vl2-tiny

Image-Text-to-Text • Updated Dec 18, 2024 • 81k • 165
deepseek-ai/deepseek-vl2-small

Image-Text-to-Text • Updated Dec 18, 2024 • 41.1k • 148
deepseek-ai/deepseek-vl2

Image-Text-to-Text • Updated Dec 18, 2024 • 16.6k • 301

LLM Tech Report

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 352
Qwen2.5-Coder Technical Report

Paper • 2409.12186 • Published Sep 18, 2024 • 141
Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement

Paper • 2409.12122 • Published Sep 18, 2024 • 3
Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 22 days ago • 163

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 180
COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training

Paper • 2401.00849 • Published Jan 1, 2024 • 17
LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents

Paper • 2311.05437 • Published Nov 9, 2023 • 50
LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing

Paper • 2311.00571 • Published Nov 1, 2023 • 41

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs