1. Technical Field
The present invention is generally directed to an improved data processing system. More specifically, the present invention is directed to an improved data processing system in which media content is classified using an ontology-based classification mechanism such that media indices may be generated for use in modeling and/or retrieving the media content.
2. Description of Related Art
The growing amounts and importance of digital video data are driving the need for more complex techniques and systems for video and multimedia indexing. Multimedia indexing is the process by which multimedia content is classified into one or more classifications based on an analysis of the content. As a result of this classification, various multimedia indices are associated with the multimedia content. These indices may then be stored for later use in modeling the multimedia content, retrieving the multimedia content in response to search queries, or the like.
Some recent techniques for multimedia indexing include extracting rich audio-visual feature descriptors, classifying multimedia content and detecting concepts using statistical models, extracting and indexing speech information, and so forth. While progress continues to be made toward more effective and efficient techniques of these kinds, the challenge remains to integrate the resulting information so that user queries of multimedia repositories can be answered effectively. There are a number of approaches for multimedia database access, which include search methods based on the above extracted information, as well as techniques for browsing, clustering, visualization, and so forth. Each approach provides an important capability.
One methodology for multimedia indexing involves classification of multimedia content into a plurality of classifications, i.e., multiple classification. In the multiple classification area, each portion of multimedia input data is mapped into one or several possible concepts. For example, a “studio-setting” image or video can show two “newsperson” subjects “sitting” behind a “desk”. This individual image or video content contains four concepts: “studio-setting”, “newsperson”, “sitting” and “desk”. For each concept, an associated classifier is developed for determining whether image or video data should be classified into that particular concept. This development is typically performed through a specific learning process. Thus, once the classifier has been developed, given unlabeled video or image data, the classifier can determine whether that data contains the corresponding semantic concept, e.g., “studio-setting,” “newsperson,” or “desk.” The methodology of learning such classifiers is essential to various multimedia applications, such as multimedia content abstraction, multimedia content modeling, multimedia content retrieval, and the like.
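By way of illustration only, the one-classifier-per-concept arrangement described above may be sketched as follows. The sketch assumes a simple linear detector per concept; the class name ConceptClassifier, the three-element feature vector, and the weights and bias values are hypothetical placeholders rather than the output of any actual learning process.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class ConceptClassifier:
    """Hypothetical linear detector for a single semantic concept."""
    concept: str
    weights: Sequence[float]  # assumed to come from some training step
    bias: float = 0.0

    def score(self, features: Sequence[float]) -> float:
        # Weighted sum of the input features plus a bias term.
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias

    def detect(self, features: Sequence[float]) -> bool:
        # The concept is considered present when the score is positive.
        return self.score(features) > 0.0


def label_content(features: Sequence[float],
                  classifiers: List[ConceptClassifier]) -> List[str]:
    """Run every concept classifier independently on the same input."""
    return [c.concept for c in classifiers if c.detect(features)]


# Illustrative, hand-set parameters (not learned from data).
classifiers = [
    ConceptClassifier("studio-setting", [1.0, 0.0, 0.0], -0.5),
    ConceptClassifier("newsperson", [0.0, 1.0, 0.0], -0.5),
    ConceptClassifier("desk", [0.0, 0.0, 1.0], -0.5),
]

unlabeled = [0.9, 0.8, 0.1]  # made-up feature vector for one image
labels = label_content(unlabeled, classifiers)
print(labels)
```

Because each classifier decides on its own, a single portion of content can receive several concept labels at once, which is the defining property of multiple classification.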
However, the learned classifiers, especially those whose semantic coverage is very restrictive, are usually unreliable. This is most typically due to an under-representative training data set used to develop the classifiers, an imbalanced ratio of positive to negative training data, and other factors. Quite a few previous efforts have been made to improve individual classifiers based on other classifiers in the multiple classification area. However, these previous approaches lack the capability to use the ontology structure to improve the accuracy of individual classifiers based on the reliable classifiers. Unfortunately, taking influence from unreliable classifiers makes the system vulnerable to becoming unstable.
For example, the system described in U.S. Pat. No. 6,233,575, entitled “Multilevel Taxonomy Based On Features Derived from Training Documents Classification Using Fisher Values as Discrimination Values,” issued on May 15, 2001, which is hereby incorporated by reference, organizes multiple concepts into a hierarchical decision tree. Each node represents one concept classifier. Classification decisions are made by a top-down traversal of the decision tree. However, concept classification is restricted to a sub-tree of the decision tree. This sacrifices global information, and an erroneous decision made at a top level of the decision tree may be propagated and accumulated in later sub-tree classifications.
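The error propagation described above may be illustrated with a minimal sketch of a generic top-down traversal. This is not the implementation of the referenced patent; the Node class, the “indoor” root concept, the feature names, and the 0.5 thresholds are all hypothetical and chosen only to make the behavior visible.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Features = Dict[str, float]


@dataclass
class Node:
    """One concept classifier placed at a node of a hypothetical tree."""
    concept: str
    accepts: Callable[[Features], bool]
    children: List["Node"] = field(default_factory=list)


def classify_top_down(node: Node, features: Features) -> List[str]:
    # A rejected node prunes its whole sub-tree: the children are never
    # evaluated, even if their own classifiers would have fired.
    if not node.accepts(features):
        return []
    found = [node.concept]
    for child in node.children:
        found.extend(classify_top_down(child, features))
    return found


# Illustrative three-level tree with made-up threshold classifiers.
tree = Node("indoor", lambda f: f["indoor"] > 0.5, [
    Node("studio-setting", lambda f: f["studio"] > 0.5, [
        Node("desk", lambda f: f["desk"] > 0.5),
    ]),
])

correct_root = {"indoor": 0.9, "studio": 0.8, "desk": 0.7}
wrong_root = {"indoor": 0.4, "studio": 0.8, "desk": 0.7}

print(classify_top_down(tree, correct_root))
print(classify_top_down(tree, wrong_root))
```

In the second call, the lower-level evidence is identical, yet no concept is reported because the root decision was wrong. The sub-tree classifiers have no opportunity to correct it.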
Reclassification models take classification outputs from single concept models as new features and then perform reclassification in order to improve performance of the classification model. The assumption behind reclassification is that points close in the feature space tend to produce similar outputs from these classifiers. However, the assumption that classification outputs are highly correlated across the various concepts might not be true. Furthermore, taking influence from unreliable classifiers would make the system vulnerable to becoming unstable.
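The reclassification idea, and its sensitivity to an unreliable input, may be sketched as follows. The sketch assumes a second stage that is a plain weighted sum of first-stage scores; in practice the second stage would itself be learned. The function name reclassify, the scores, and the weights are hypothetical values chosen for illustration.

```python
from typing import Dict


def reclassify(scores: Dict[str, float],
               weights: Dict[str, float],
               bias: float = 0.0) -> float:
    """Second-stage score computed from first-stage concept outputs."""
    return bias + sum(weights[c] * scores[c] for c in weights)


# First-stage outputs of three single-concept models for one image.
first_stage = {"studio-setting": 0.9, "newsperson": 0.2, "desk": 0.8}

# Hand-set second-stage weights for refining the "desk" decision.
desk_weights = {"studio-setting": 0.3, "newsperson": 0.3, "desk": 0.4}

refined = reclassify(first_stage, desk_weights)

# Suppose the "newsperson" model is unreliable and reports 0.9 instead
# of 0.2 for the same image. Its error leaks into the refined score.
noisy_stage = dict(first_stage, newsperson=0.9)
refined_noisy = reclassify(noisy_stage, desk_weights)

print(round(refined, 2), round(refined_noisy, 2))
```

Here the refined “desk” score moves from roughly 0.65 to roughly 0.86 although nothing about the desk evidence changed, which is the kind of instability the paragraph above attributes to taking influence from unreliable classifiers.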