Discriminative Reranking based Video Object Retrieval
For the instance search task, we are given a set of query images, each with textual metadata and an object mask, and must retrieve the video shots that contain the query objects from a FLICKR video database. We extract meaningful regions in the key-frames using Maximally Stable Extremal Regions (MSER) and describe them with SIFT descriptors. We represent database images with the standard Bag of Visual Words (BoW) model. Additionally, using the textual metadata, we crawled training images for each query topic from the Google and FLICKR image databases and trained a discriminative classifier using Support Vector Machines (SVM). We use this discriminative model to rerank the candidate images returned by the initial BoW search. The experimental results demonstrate the efficacy of the overall system. Finally, we highlight the need for domain adaptation when the source and target domains are completely different.
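The search-then-rerank pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the function names (`bow_histogram`, `initial_search`, `train_linear_svm`, `rerank`) are hypothetical, MSER detection and SIFT extraction are omitted and replaced by toy 2-D descriptors, the vocabulary is hand-written rather than learned by clustering, and the SVM is a plain batch subgradient solver for the hinge loss standing in for whatever SVM package was actually used.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def bow_histogram(descriptors, vocabulary):
    """Assign each descriptor to its nearest visual word; return an L2-normalised histogram."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        nearest = min(
            range(len(vocabulary)),
            key=lambda k: sum((x - c) ** 2 for x, c in zip(d, vocabulary[k])),
        )
        hist[nearest] += 1.0
    norm = math.sqrt(sum(h * h for h in hist)) or 1.0
    return [h / norm for h in hist]

def initial_search(query_hist, database, top_k):
    """Rank shots by dot product with the query histogram and keep the top_k candidates."""
    ranked = sorted(database, key=lambda sid: dot(query_hist, database[sid]), reverse=True)
    return ranked[:top_k]

def train_linear_svm(samples, labels, lam=0.01, epochs=200, lr=0.1):
    """Batch subgradient descent on the L2-regularised hinge loss; labels are +1 or -1."""
    w = [0.0] * len(samples[0])
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = [lam * wi for wi in w]
        grad_b = 0.0
        for x, y in zip(samples, labels):
            if y * (dot(w, x) + b) < 1.0:  # sample violates the margin
                for i, xi in enumerate(x):
                    grad_w[i] -= y * xi / n
                grad_b -= y / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def rerank(candidates, database, w, b):
    """Reorder the candidate shots by the classifier's decision value."""
    return sorted(candidates, key=lambda sid: dot(w, database[sid]) + b, reverse=True)

# Toy data (hypothetical): three visual words over 2-D descriptors.
vocabulary = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
database = {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0], "c": [0.6, 0.8, 0.0]}

# Crawled training images would supply these histograms; here they are hand-made.
w, b = train_linear_svm([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [1, -1])

candidates = initial_search([1.0, 0.0, 0.0], database, top_k=3)
reranked = rerank(candidates, database, w, b)
```

In this sketch the classifier is trained on images from a different source than the database being searched, which is the setting where the domain mismatch noted in the abstract would arise.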
“Discriminative Reranking based Video Object Retrieval”,
TRECVID Technical Report, 2012.
Node ID: 597, DB ID: 407, Lab: VRL, Target: Proceedings