Multicamera video summarization from optimal reconstruction


We propose a principled approach to video summarization that uses optimal reconstruction as the metric guiding the creation of the summary. The spatio-temporal video patches included in the summary are viewed as observations of the local motion in the original input video, and are chosen to minimize the reconstruction error of the missing observations under a set of learned predictive models. The method is demonstrated on fixed-viewpoint video sequences and shown to generalize to multi-camera systems with disjoint views, in which activity already summarized in one view can inform the summary of another. The results show that this approach can significantly reduce, or even eliminate, summary patches containing activity that is already expected given the other summary patches, leading to a more concise output.
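
As a rough illustration of the selection idea described in the abstract, the sketch below greedily picks the patches that best reconstruct all the others. It is not the authors' method: ordinary least-squares reconstruction stands in for the learned predictive models, and the names `reconstruction_error`, `greedy_summary`, the feature matrix `X` (one row per patch), and `budget` are hypothetical choices made only for this example.

```python
import numpy as np


def reconstruction_error(X, selected):
    """Total squared error when every patch (row of X) is predicted
    as a linear combination of the selected patches.

    Assumption: a least-squares fit replaces the learned predictive models.
    """
    if not selected:
        # Nothing selected: every patch is predicted as zero.
        return float(np.sum(X ** 2))
    B = X[selected]                    # (k, d) selected patches
    W = X @ np.linalg.pinv(B)          # least-squares weights, shape (n, k)
    return float(np.sum((X - W @ B) ** 2))


def greedy_summary(X, budget):
    """Add one patch at a time, always choosing the patch that most
    reduces the reconstruction error of the full set."""
    selected = []
    for _ in range(budget):
        candidates = [i for i in range(len(X)) if i not in selected]
        best = min(
            candidates,
            key=lambda i: reconstruction_error(X, selected + [i]),
        )
        selected.append(best)
    return selected


if __name__ == "__main__":
    # Toy data: patches 0 and 1 describe the same motion at different
    # magnitudes, patch 2 describes a different motion.
    X = np.array([[1.0, 0.0],
                  [2.0, 0.0],
                  [0.0, 1.0]])
    print(greedy_summary(X, budget=2))
```

In this toy setting, a patch that is already predictable from the chosen ones adds nothing to the reconstruction, so the greedy step prefers the patch showing new activity. The same reasoning would extend to several cameras by stacking patches from all views into one matrix, under the assumption that a shared predictor relates them.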
Carter de Leo and B.S. Manjunath,
Tenth Asian Conference on Computer Vision, Queenstown, New Zealand, Nov. 2010.