Publications

(2026). AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
(2026). RIVER: Real-time Video Interaction Benchmark. The Fourteenth International Conference on Learning Representations (ICLR).
(2026). PixNerd: Pixel Neural Field Diffusion. The Fourteenth International Conference on Learning Representations (ICLR).