Video Annotation Through Search and Graph Reinforcement Mining

Emily Moxley (1), Tao Mei (2), and B. S. Manjunath (1)
(1) Department of Electrical and Computer Engineering
University of California, Santa Barbara,
CA 93106
{emoxley, manj} [at] ece.ucsb.edu

(2) Microsoft Research Asia, Beijing, China
tmei [at] microsoft.com

Abstract

Unlimited vocabulary annotation of multimedia documents remains elusive despite progress solving the problem in the case of a small, fixed lexicon. Taking advantage of the repetitive nature of modern information and online media databases with independent annotation instances, we present an approach to automatically annotate multimedia documents that uses mining techniques to discover new annotations from similar documents and to filter existing incorrect annotations. The annotation set is not limited to words that have training data or for which models have been created. It is limited only by the words in the collective annotation vocabulary of all the database documents. A graph reinforcement method driven by a particular modality (e.g., visual) is used to determine the contribution of a similar document to the annotation target. The graph supplies possible annotations of a different modality (e.g., text) that can be mined for annotations of the target. Experiments are performed using videos crawled from YouTube. A customized precision-recall metric shows that the annotations obtained using the proposed method are superior to those originally existing for the document. These extended, filtered tags are also superior to a state-of-the-art semi-supervised technique for graph reinforcement learning on the initial user-supplied annotations.
E. Moxley, T. Mei and B. S. Manjunath,
IEEE Transactions on Multimedia, vol. 12, no. 3, pp. 184-193, Apr. 2010.
Subject: Managing Multimedia Databases