Publications

Multi-view Video Summarization
Yanwei Fu, Yanwen Guo*, Yanshu Zhu, Feng Liu, Chuanming Song and Zhi-Hua Zhou
IEEE Trans. on Multimedia, 12(7), 2010, Regular Paper (IEEE TMM, SCI, EI, Corresponding Author)

Previous video summarization studies focused on monocular videos, and their methods perform poorly when applied directly to multi-view videos, owing to problems such as redundancy across multiple views. In this paper, we present a method for summarizing multi-view videos. We construct a spatio-temporal shot graph and formulate the summarization problem as a graph labeling task. The spatio-temporal shot graph is derived from a hypergraph, which encodes correlations of different attributes among multi-view video shots in its hyperedges. We then partition the shot graph and identify clusters of event-centered shots with similar contents via random walks. The summarization result is generated by solving a multi-objective optimization problem based on shot importance, which is evaluated using a Gaussian entropy fusion scheme. Different summarization objectives, such as minimum summary length and maximum information coverage, can be accomplished within the framework. Moreover, multi-level summarization can be achieved easily by configuring the optimization parameters. We also propose the multi-view storyboard and the event board for presenting multi-view summaries. The storyboard naturally reflects correlations among multi-view summarized shots that describe the same important event. The event board serially assembles event-centered multi-view shots in temporal order. A single video summary, which facilitates quick browsing of the summarized multi-view video, can be easily generated from the event board representation.

Discriminative Nonorthogonal Binary Subspace Tracking
Ang Li, Feng Tang, Yanwen Guo*, and Hai Tao
In Proc. 11th European Conference on Computer Vision (ECCV 2010)

Visual tracking is one of the central problems in computer vision. A crucial problem in tracking is how to represent the object. Traditional appearance-based trackers use increasingly complex features in order to be robust. However, complex representations typically not only require more computation for feature extraction but also complicate state inference. In this paper, we show that, with a careful feature selection scheme, extremely simple yet discriminative features can be used for robust object tracking. The central component of the proposed method is a succinct and discriminative representation of the image template using a discriminative non-orthogonal binary subspace (NBS) spanned by Haar-like features. These Haar-like bases are selected from an over-complete dictionary using a variation of optimized orthogonal matching pursuit (OOMP). Such a representation inherits the merits of the original NBS in that it can be used to describe the object efficiently. It also incorporates discriminative information to distinguish the foreground from the background. We apply the discriminative NBS to object tracking through template matching based on the sum of squared differences (SSD). An update scheme for the discriminative NBS is devised to accommodate changes in object appearance. We validate the effectiveness of our method through extensive experiments on challenging videos and demonstrate its capability to track objects against cluttered and moving backgrounds.
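
As a rough illustration of the SSD-based template matching mentioned in the abstract above, the sketch below slides a template over a small grayscale grid and returns the window with the lowest sum of squared differences. It is only a minimal sketch operating on raw pixel values: it does not use the discriminative NBS representation or Haar-like bases from the paper, and the names `ssd` and `match_template` as well as the toy data are assumptions made for this example.

```python
def ssd(patch, template):
    """Sum of squared differences between two equally sized 2-D grids."""
    return sum(
        (p - t) ** 2
        for patch_row, template_row in zip(patch, template)
        for p, t in zip(patch_row, template_row)
    )


def match_template(image, template):
    """Return (row, col, score) of the window with the lowest SSD.

    Exhaustive search over raw pixels (an assumption of this sketch,
    not the NBS-based matching described in the paper).
    """
    th, tw = len(template), len(template[0])
    ih, iw = len(image), len(image[0])
    best = None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            # Cut out the window whose top-left corner is (r, c).
            window = [row[c:c + tw] for row in image[r:r + th]]
            score = ssd(window, template)
            if best is None or score < best[2]:
                best = (r, c, score)
    return best


# Hypothetical toy frame: the template appears at row 1, column 1.
image = [
    [0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0],
    [0, 7, 9, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 8],
    [7, 9],
]
result = match_template(image, template)
```

In a tracker, a search of this kind would typically be restricted to a neighborhood of the previous object position in each new frame; the exhaustive scan here is kept only for brevity.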