Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset Article • By sdiazlor • 1 day ago • 22
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 21 days ago • 315
RAFT: Adapting Language Model to Domain Specific RAG Paper • 2403.10131 • Published Mar 15, 2024 • 69
Fine-tune ModernBERT for RAG with Synthetic Data Article • By sdiazlor and 2 others • 22 days ago • 35
Towards Best Practices for Open Datasets for LLM Training Paper • 2501.08365 • Published 28 days ago • 54
OpenRFT: Adapting Reasoning Foundation Model for Domain-specific Tasks with Reinforcement Fine-Tuning Paper • 2412.16849 • Published Dec 22, 2024 • 9
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 255
Understanding Chain-of-Thought in LLMs through Information Theory Paper • 2411.11984 • Published Nov 18, 2024 • 1
HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs Paper • 2412.18925 • Published Dec 25, 2024 • 97
Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models Paper • 2501.01830 • Published Jan 3 • 18
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases Paper • 2402.14905 • Published Feb 22, 2024 • 126
MobileQuant: Mobile-friendly Quantization for On-device Language Models Paper • 2408.13933 • Published Aug 25, 2024 • 15