The present invention relates to audiovisual information systems, and more specifically to a system for describing, classifying, and retrieving audiovisual information using semantic descriptions of that information.
The amount of multimedia content available on the World Wide Web and in numerous other databases is growing rapidly. However, the enthusiasm for developing multimedia content has led to increasing difficulties in managing, accessing, and identifying such content, mostly because of its volume. Furthermore, the complexity of the content and a lack of adequate indexing standards are problematic. To address these problems, MPEG-7 is being developed by the Moving Picture Experts Group (MPEG), which is a working group of ISO/IEC. In contrast to preceding MPEG standards such as MPEG-1 and MPEG-2, which relate to the coded representation of audiovisual content, MPEG-7 is directed to representing information relating to content, and not the content itself.
The MPEG-7 standard, formally called the “Multimedia Content Description Interface,” seeks to provide a rich set of standardized tools for describing multimedia content. The objective is to provide a single standard offering interoperable, simple, and flexible solutions to the aforementioned problems of indexing, searching, and retrieving multimedia content. It is anticipated that software and hardware systems for efficiently generating and interpreting MPEG-7 descriptions will be developed.
More specifically, MPEG-7 defines and standardizes the following: (1) a core set of Descriptors (Ds) for describing the various features of multimedia content; (2) Description Schemes (DSs) which are pre-defined structures of Descriptors and their relationships; and (3) a Description Definition Language (DDL) for defining Description Schemes and Descriptors.
A Descriptor (D) defines both the semantics and the syntax for representing a particular feature of audiovisual content. A feature is a distinctive characteristic of the data which is of significance to a user. . . .
As noted, DSs are pre-defined structures of Descriptors and their relationships. Specifically, a DS sets forth the structure and semantics of the relationships between its components, which may be Descriptors, Description Schemes, or both. To describe audiovisual content, a concept known as syntactic structure, which specifies the physical and logical structure of audiovisual content, is utilized.
The Description Definition Language (DDL) is the language that allows the creation of new Description Schemes and Descriptors. It also allows the extension and modification of existing Description Schemes. The DDL has to be able to express spatial, temporal, structural, and conceptual relationships between the elements of a DS, and between DSs.
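The relationship between Descriptors and Description Schemes can be illustrated with a minimal Python sketch. It is not DDL syntax and not part of MPEG-7; the class names, the `relations` triple format, and the sample feature names and values are all assumptions chosen for illustration. It only shows that a DS is a named structure whose components may be Descriptors, other DSs, or both.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Union


@dataclass
class Descriptor:
    """One feature of the content: a name plus a value (hypothetical model)."""
    name: str
    value: object


@dataclass
class DescriptionScheme:
    """A named structure whose components are Descriptors and/or other DSs."""
    name: str
    components: list[Union[Descriptor, "DescriptionScheme"]] = field(default_factory=list)
    # Each relation is (source name, relation label, target name); format is assumed.
    relations: list[tuple[str, str, str]] = field(default_factory=list)


def descriptor_names(ds: DescriptionScheme) -> list[str]:
    """Collect the names of all Descriptors nested anywhere inside a DS."""
    names = []
    for component in ds.components:
        if isinstance(component, Descriptor):
            names.append(component.name)
        else:
            # A nested DS: recurse into its own components.
            names.extend(descriptor_names(component))
    return names


# Illustrative data only: a "Scene" DS that contains a Descriptor and a nested "Shot" DS.
shot = DescriptionScheme(
    name="Shot",
    components=[Descriptor("DominantColor", (200, 30, 30)), Descriptor("Duration", 4.2)],
)
scene = DescriptionScheme(
    name="Scene",
    components=[Descriptor("Title", "Birthday party"), shot],
    relations=[("Scene", "contains", "Shot")],
)

print(descriptor_names(scene))
```

Nesting a DS inside another DS is what lets a description mirror the structure of the content, while the `relations` list stands in for the relationships that a DS is said to define between its components.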
DS (Description Schemes)
Among other DSs, the DSs that make up the Semantic DS are as follows.
(1) STime, which deals with semantic time descriptions. A semantic time description may be written without reference to any time standard; for instance, “at Chrissy's birthday party, last year” is a reasonable semantic description of a time.
(2) SLocation, which deals with semantic place descriptions. The same model applies to semantic locations as to semantic times; for instance, “down the street” is a valid (if somewhat vague) semantic place.
(3) MediaLocator, which connects the description to the media.
(4) MediaOccurrence: This DS is a lightweight segment, in the same way that an annotation is a lightweight semantic description.
(5) AnalyticModel: This DS allows the use of non-verbal material in construction of descriptions.
(6) Object: This DS describes objects occurring in the media.
(7) Event: This DS describes events occurring in the media.
(8) SemanticDescription: This DS encapsulates a complete description of a narrative world. The concept of a narrative world is somewhat intuitive: it is a context plus the objects and events necessary to describe a situation. That situation could be a movie, a scene, or a shot, or it could be a situation that is described secondarily to aid the current description. Although such a scene or narrative world may have multiple descriptions, each of these is handled by a single Semantic DS.
(9) Concept: This DS is an abstraction tool that resembles a semantic description.
(10) SemanticGraph: This DS is a graph of the relations between the DSs in semantic descriptions.
(11) State: A bundle of attribute-value pairs that allow the specification of parameter values at an instant of time or at a particular location.
(12) UsageDescription: A boolean indicating the purpose of a description, that is, whether it is intended as a description or as an indexing element. There are other DSs as well; for instance, each DS within SemanticDescription that has access to media, as well as the graph, has a counterpart within Concept.
(13) Semantic DS. This is used to hold one or several SemanticDescriptions or Concepts, or both. Further, abstract descriptions, in the form of Concepts, are stored in Classification Schemes, as part of the description of controlled terms.
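As a rough illustration of how several of the DSs listed above fit together, the following Python sketch models one narrative world. It is a simplified, hypothetical model and not the MPEG-7 definition of these DSs: the class and field names, the edge format of the graph, and the sample objects, events, and relation labels are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class SemanticEntity:
    """An Object or Event in a narrative world (kind is 'object' or 'event')."""
    label: str
    kind: str
    # State: attribute-value pairs valid at some instant or location.
    state: dict = field(default_factory=dict)


@dataclass
class SemanticDescription:
    """One narrative world: a context plus its objects, events, and relations."""
    s_time: str       # STime: free-text semantic time, no time standard required
    s_location: str   # SLocation: free-text semantic place
    entities: dict = field(default_factory=dict)
    # SemanticGraph: (source label, relation label, target label) edges.
    graph: list = field(default_factory=list)
    is_index: bool = False  # UsageDescription: description vs. indexing element

    def add(self, entity):
        self.entities[entity.label] = entity

    def relate(self, source, relation, target):
        # Only allow edges between entities already in this narrative world.
        if source not in self.entities or target not in self.entities:
            raise KeyError("both endpoints must be added before relating them")
        self.graph.append((source, relation, target))

    def related_to(self, label):
        """Return the labels reachable from `label` by one outgoing edge."""
        return [t for s, _, t in self.graph if s == label]


# Illustrative narrative world built from the example phrases in the list above.
party = SemanticDescription("at Chrissy's birthday party, last year", "down the street")
party.add(SemanticEntity("Chrissy", "object", {"mood": "happy"}))
party.add(SemanticEntity("cake", "object"))
party.add(SemanticEntity("blow out candles", "event"))
party.relate("Chrissy", "agent of", "blow out candles")
party.relate("blow out candles", "affects", "cake")

print(party.related_to("Chrissy"))
```

Keeping the relations in a separate edge list, rather than inside each object or event, reflects the role the list assigns to SemanticGraph as a graph of the relations between the other DSs.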
Conventionally, these DSs are employed for describing the semantic relationships that occur. When a new relationship is found, DSs are added to accommodate it. Disadvantageously, it is unclear whether the new DSs can support the new semantic relationship until some experimentation is carried out. Moreover, conventional techniques have limited expressive power for describing arbitrary structures.
Therefore, there is a need to resolve the aforementioned problems, and the present invention meets this need.