Very little work exists in the literature (at least up to 2002) on mining multimedia data. The exponential growth of such data in consumer as well as scientific applications poses many interesting and task-critical challenges. There are several interrelated issues in the management of such data, including feature extraction, similarity-based search, high-dimensional indexing, scalability to large data sets, and personalized search and retrieval.
Nowadays, the capacity for storing data (data warehousing) is far larger than the capacity for analyzing and understanding it. Discovering “knowledge” in such large datasets (that is, mining the data) can be very challenging. Generally speaking, knowledge discovery or data mining in databases is the nontrivial extraction of previously unknown and potentially useful information from data (Frawley, 1992). Data mining is essentially the computer-assisted process of information analysis (Fayyad 1996). The data mining process seeks to build a better understanding and characterization of the data that is useful for further analysis. Data mining techniques unify existing methods from machine learning, pattern recognition, databases, statistics, data visualization, etc.
Data mining primarily addresses the issues connected to the vast amount of data available nowadays. The explosion of data has caused a comparable explosion in the need to analyze it. A great increase in computational power has made data mining techniques that might once have been too computationally expensive quite feasible. Moreover, since the stored data are more complex, the number of fields associated with each record in the database has increased. The higher dimensionality of the dataset exposes existing methods (e.g. clustering, decision trees, classification, statistical and learning methods) to the “curse of dimensionality”.
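One commonly cited symptom of the curse of dimensionality is that distances between points become less discriminative as the number of fields grows, which weakens similarity-based methods such as clustering and nearest-neighbor search. The following is a minimal sketch of that effect, not a method taken from the text above. It assumes uniformly random points in the unit cube and Euclidean distance, and the function name `relative_contrast` is a hypothetical helper introduced only for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for one random query point.

    Assumption for illustration: points are drawn uniformly from the
    unit cube [0, 1]^dim and compared with Euclidean distance.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.dist(query, point))
    # A small value means the nearest and farthest points are almost
    # equally far away, so "nearest" carries little information.
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

Under these assumptions the printed contrast should shrink as the dimension grows: in two dimensions the farthest point is many times farther away than the nearest one, while in a thousand dimensions all points lie at nearly the same distance from the query.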