Balancing the Experts: Unlocking LoRA-MoE for GRPO via Mechanism-Aware Rewards

Apr 24, 2026 · Changlian Ma, Zizheng Huang, Xiangyu Zeng, Yi Wang, Cheng Liang, Kun Tian, Xinhai Zhao, Limin Wang
Type
Publication
The Fourteenth International Conference on Learning Representations