There is no such thing as a tokenizer-free lunch
Model Quality: Hugging Face Is All You Need
ModernVBERT: Towards Smaller Visual Document Retrievers
CU-1 for Autonomous UI Agent Systems: An Open Alternative to Proprietary Solutions
Code a simple RAG from scratch
When Does Reasoning Matter? Unpacking the Contribution of Reasoning to LLM Performance
How I Trained Action Chunking Transformer (ACT) on SO-101: My Journey, Gotchas, and Lessons
Preserving Agency: Why AI Safety Needs Community, Not Corporate Control
Uncensor any LLM with abliteration
RexBERT: Encoders for a brave new world of E-Commerce
Gaia2 Leaderboard Update: New Models and New Observations
How to Train an Antibody Developability Model
Nemotron-Personas-Japan: Synthesized Data for Sovereign AI
Nemotron-Personas-Japan: A Synthetic Dataset for Sovereign AI (Japanese version)
Introduction to State Space Models (SSM)
Practical arXiv Tips: How to Get More Attention for Your Paper
Fine-Tuning Your First Large Language Model (LLM) with PyTorch and Hugging Face
DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge
Small Language Models (SLM): A Comprehensive Overview
From GRPO to DAPO and GSPO: What, Why, and How