view article Article From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning By NormalUhr • 7 days ago • 6
view article Article Fine-tune Deepseek-R1 with a Synthetic Reasoning Dataset By sdiazlor • 1 day ago • 21
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning Paper • 2502.03275 • Published 6 days ago • 11
The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles Paper • 2502.01081 • Published 9 days ago • 12
Hibiki fr-en Collection Hibiki is a model for streaming speech translation , which can run on device! See https://github.com/kyutai-labs/hibiki. • 5 items • Updated 5 days ago • 45
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published 7 days ago • 154
view article Article Fine-tune ModernBERT for RAG with Synthetic Data By sdiazlor and 2 others • 22 days ago • 35
Explaining Large Language Models Decisions Using Shapley Values Paper • 2404.01332 • Published Mar 29, 2024 • 1
Preference Leakage: A Contamination Problem in LLM-as-a-judge Paper • 2502.01534 • Published 8 days ago • 34
Model2Vec base models Collection These are the Minishlab Model2Vec base models. Load them and use them with model2vec (https://github.com/MinishLab/model2vec) or sentence-transformers • 9 items • Updated 5 days ago • 9
POTION Collection These are the flagship POTION models. Load them and use them with model2vec (https://github.com/MinishLab/model2vec) or sentence-transformers • 5 items • Updated 9 days ago • 10