AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs

May 5, 2026

Lidong Lu, Guo Chen, Zhu Wei, Zhiqi Li, Yicheng Liu, Tong Lu


Publication: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Last updated on May 5, 2026

