- DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data (arXiv 2405.14333)
- DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search (arXiv 2408.08152)
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (arXiv 2501.12948)
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (arXiv 2402.03300)

Collections including paper arXiv:2402.03300

- Atla Selene Mini: A General Purpose Evaluation Model (arXiv 2501.17195)
- DeepSeek-V3 Technical Report (arXiv 2412.19437)
- Optimizing Large Language Model Training Using FP4 Quantization (arXiv 2501.17116)
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (arXiv 2402.03300)

- Reasoning Language Models: A Blueprint (arXiv 2501.11223)
- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (arXiv 2402.03300)
- DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning (arXiv 2501.12948)

- Evolving Deeper LLM Thinking (arXiv 2501.09891)
- ProcessBench: Identifying Process Errors in Mathematical Reasoning (arXiv 2412.06559)
- AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling (arXiv 2412.15084)
- The Lessons of Developing Process Reward Models in Mathematical Reasoning (arXiv 2501.07301)

- DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (arXiv 2402.03300)
- SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training (arXiv 2501.17161)
- The Ultra-Scale Playbook (Space): 🌌 The ultimate guide to training LLMs on large GPU clusters

- The Ultimate Guide to LLM Training | The Ultra-Scale Playbook (Space): 🔥 Learn about every aspect of LLM training

- deepseek-ai/DeepSeek-V3-Base (model)
- TransMLA: Multi-head Latent Attention Is All You Need (arXiv 2502.07864)
- Qwen2.5 Bakeneko 32b Instruct Awq (Space): ⚡ Generate text-based responses for chat interactions
- Deepseek R1 Distill Qwen2.5 Bakeneko 32b Awq (Space): ⚡ Generate detailed responses based on user queries

- DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model (arXiv 2405.04434)
- Titans: Learning to Memorize at Test Time (arXiv 2501.00663)
- Transformer^2: Self-adaptive LLMs (arXiv 2501.06252)
- Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention (arXiv 2502.11089)