GDPO Author

Researcher and author of Group reward-Decoupled Normalization Policy Optimization

San Francisco, USA

About GDPO

The author is a researcher in reinforcement learning and policy optimization, with a particular focus on multi-reward reinforcement learning. They have published several papers on methods for improving the performance of reinforcement learning algorithms and have worked on projects developing new algorithms and techniques in this area. They are active on Twitter and enjoy engaging with their audience.

Associated Podcasts

Seventy3

Guest

Key Topics

technology, education, science

Recent Appearances

[Episode 509] GDPO: Decoupled Normalization Policy Optimization for Multi-Reward Reinforcement Learning

Seventy3 • Feb 20, 2026 • 15:25 • Guest