Article Model2Vec: Distill a Small Fast Model from any Sentence Transformer By Pringled and 1 other • Oct 14, 2024 • 69
Tulu 3 Models Collection All models released with Tulu 3 -- state of the art open post-training recipes. • 10 items • Updated 1 day ago • 90
Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate Paper • 2501.17703 • Published 14 days ago • 51
Article Train 400x faster Static Embedding Models with Sentence Transformers 28 days ago • 142
Qwen2.5-VL Collection Vision-language model series based on Qwen2.5 • 3 items • Updated 16 days ago • 337
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published 21 days ago • 315
Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability Paper • 2411.19943 • Published Nov 29, 2024 • 57
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation Paper • 2412.02592 • Published Dec 3, 2024 • 22
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking Paper • 2501.04519 • Published Jan 8 • 255
LLM4SR: A Survey on Large Language Models for Scientific Research Paper • 2501.04306 • Published Jan 8 • 33
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token Paper • 2501.03895 • Published Jan 7 • 48
Personalized Graph-Based Retrieval for Large Language Models Paper • 2501.02157 • Published Jan 4 • 28
OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System Paper • 2412.20005 • Published Dec 28, 2024 • 17