AV-Reasoner: Improving and Benchmarking Clue-Grounded Audio-Visual Counting for MLLMs

May 5, 2026

Lidong Lu, Guo Chen, Zhu Wei, Zhiqi Li, Yicheng Liu, Tong Lu


Type: Conference paper

Publication: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Last updated on May 5, 2026

Tong Lu (Nanjing University)
