Balancing the Experts: Unlocking LoRA-MoE for GRPO via Mechanism-Aware Rewards

Apr 24, 2026 · Changlian Ma, Zizheng Huang, Xiangyu Zeng, Yi Wang, Cheng Liang, Kun Tian, Xinhai Zhao, Limin Wang

Type: Conference paper
Publication: The Fourteenth International Conference on Learning Representations
Last updated on Apr 24, 2026
Authors: Limin Wang, Nanjing University