B. Sumengen, M. Hol, F.E. Kalsbeek, Ullrich Moenich, Till Quack, Lars Thiele, and B.S. Manjunath
The goal of this project is to take content-based image retrieval one step further in scale and closer to real-world applications. The system currently handles over 10 million images, and the collection is still growing.
Recent advances in the processing and networking capabilities of computers have led to the accumulation of immense amounts of multimedia data such as images. One of the largest repositories for such data is the World Wide Web (WWW). We present Cortina, a large-scale image retrieval system for the WWW. It handles over 10 million images to date. The system retrieves images based on visual features and collateral text. We show that a search process consisting of an initial query-by-keyword or query-by-image, followed by relevance feedback on the visual appearance of the results, is feasible for large-scale data sets. We also show that it is superior to the pure text retrieval commonly used in large-scale systems. Semantic relationships in the data are explored and exploited through data mining, and multiple feature spaces are included in the search process.
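
The two-stage search process described above (keyword query first, then relevance feedback on visual appearance) can be illustrated with a small sketch. This is not Cortina's implementation: the abstract does not say which feedback method the system uses, so the sketch substitutes a Rocchio-style update as a stand-in. The toy collection, the image ids, the two-dimensional feature vectors, and the function names keyword_search, rocchio_query, and visual_rerank are all hypothetical and chosen only for illustration.

```python
from math import sqrt

# Toy collection (hypothetical data): each image has collateral text
# keywords and a 2-D visual feature vector. Real systems would use
# high-dimensional descriptors and an index, not a dict scan.
IMAGES = {
    "img1": {"text": {"tiger", "zoo"}, "feat": [0.9, 0.1]},
    "img2": {"text": {"tiger", "car"}, "feat": [0.1, 0.9]},
    "img3": {"text": {"cat"}, "feat": [0.8, 0.2]},
    "img4": {"text": {"truck"}, "feat": [0.2, 0.8]},
}


def keyword_search(keyword):
    """Stage 1: ids of images whose collateral text contains the keyword."""
    return sorted(i for i, img in IMAGES.items() if keyword in img["text"])


def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def rocchio_query(relevant, non_relevant, beta=1.0, gamma=0.5):
    """Stage 2a: build a visual query from user feedback (Rocchio-style).

    Assumption: the query moves toward the mean of the images marked
    relevant and away from the mean of those marked non-relevant.
    """
    pos = mean([IMAGES[i]["feat"] for i in relevant])
    if not non_relevant:
        return pos
    neg = mean([IMAGES[i]["feat"] for i in non_relevant])
    return [beta * p - gamma * n for p, n in zip(pos, neg)]


def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def visual_rerank(query_vec):
    """Stage 2b: rank the whole collection by distance to the visual query."""
    return sorted(IMAGES, key=lambda i: distance(IMAGES[i]["feat"], query_vec))


# Usage: search by keyword, then refine with feedback on the results.
hits = keyword_search("tiger")                      # text-only results
query = rocchio_query(relevant=["img1"], non_relevant=["img2"])
ranking = visual_rerank(query)                      # visually refined results
```

In this toy run the keyword "tiger" matches both an animal image and an unrelated one. After the user marks one of each, the visual re-ranking promotes img3, which the text query alone could not find because its collateral text does not contain the keyword. That is the kind of gain over pure text retrieval the abstract refers to, shown here only on made-up data.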