Discriminative Reranking based Video Object Retrieval


For the instance search task, we are given a set of query images, with corresponding textual meta-data and object masks, and must retrieve video shots containing the query objects from the FLICKR video database. We extract meaningful regions in the key-frames using Maximally Stable Extremal Regions (MSER) and represent them with SIFT descriptors. We use the standard Bag of visual Words (BoW) model to represent database images. Additionally, using the textual meta-data, we crawled training images for each query topic from the Google and FLICKR image databases to train a discriminative classifier using Support Vector Machines (SVM). We use this discriminative model to rerank the candidate images obtained by the initial BoW search. The experimental results demonstrate the efficacy of the overall system. Finally, we highlight the need for domain adaptation when the source and target domains are completely different.
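The two-stage retrieval described in the abstract (an initial BoW search followed by discriminative reranking) can be illustrated with a minimal sketch. This is not the authors' implementation. MSER region detection and SIFT extraction are omitted, descriptors are toy low-dimensional vectors, cosine similarity is assumed for the initial search, and the SVM is a simple hinge-loss subgradient trainer standing in for whatever solver the report used. All function names and parameters are hypothetical.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bow_histogram(descriptors, vocabulary):
    """Assign each descriptor to its nearest visual word; return an L2-normalized histogram."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        nearest = min(
            range(len(vocabulary)),
            key=lambda k: sum((a - b) ** 2 for a, b in zip(d, vocabulary[k])),
        )
        hist[nearest] += 1.0
    norm = math.sqrt(dot(hist, hist)) or 1.0
    return [h / norm for h in hist]


def train_linear_svm(samples, labels, epochs=200, lam=0.01, lr=0.1):
    """Toy linear SVM (assumed stand-in): hinge-loss subgradient descent, labels in {-1, +1}."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            margin = y * (dot(w, x) + b)
            # L2 regularization shrinks the weights at every step.
            w = [wi * (1.0 - lr * lam) for wi in w]
            if margin < 1.0:
                # Sample violates the margin: step toward classifying it correctly.
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b


def search_and_rerank(query_hist, database, w, b, shortlist=3):
    """Stage 1: shortlist by cosine similarity. Stage 2: reorder shortlist by SVM score."""
    ranked = sorted(database, key=lambda name: dot(query_hist, database[name]), reverse=True)
    candidates = ranked[:shortlist]
    return sorted(candidates, key=lambda name: dot(w, database[name]) + b, reverse=True)
```

In this sketch the classifier would be trained on BoW histograms of the crawled images for a query topic (positives) against histograms of unrelated images (negatives), then applied only to the shortlist, so the more expensive discriminative step touches a small fraction of the database.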
Santhoshkumar Sunderrajan, Niloufar Pourian, Mahmudul Hasan, Yingying Zhu, B.S. Manjunath and Amit Roy Chowdhury,
“Discriminative Reranking based Video Object Retrieval”,
TRECVID Technical Report, 2012.