Publications

(2026). VMonarch: Efficient Video Diffusion Transformers with Structured Attention. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
(2026). TimeLens: Rethinking Video Temporal Grounding with Multimodal LLMs. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
(2026). Rethinking BCE Loss for Multi-Label Image Recognition with Fine-Tuning. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
(2026). InternVideo-Next: Towards World-Understanding Video Models. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
(2026). DDT: Decoupled Diffusion Transformer. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).
(2026). Bayesian Decomposition and Semantic Completion for Few-shot Semantic Segmentation. Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).