Yihua Zhang

NormalUhr

Organizations

OPTML Group @ MSU

NormalUhr's activity

published an article about 6 hours ago

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

By NormalUhr
published an article 4 days ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

By NormalUhr
published an article 7 days ago

A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons

By NormalUhr
published an article 7 days ago

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

By NormalUhr
published an article 7 days ago

MLA: Redefining KV-Cache Through Low-Rank Projections and On-Demand Decompression

By NormalUhr