Nemotron-Personas: Improve AI Training With the First Synthetic Personas Dataset Aligned to Real-World Distributions By nvidia and 1 other • 3 days ago • 10
Saying Thank You to a LLM Isn't Free — Measuring the Energy Cost of Politeness By jdelavande and 2 others • 1 day ago • 6
Holo1: New family of GUI automation VLMs powering GUI agent Surfer-H By Hcompany and 1 other • 10 days ago • 65
DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge By NormalUhr • Feb 7 • 152