Digital multimedia information is becoming widely distributed through broadcast transmission, such as digital television signals, and interactive transmission, such as the Internet. The information may take the form of still images, audio feeds, or video data streams. However, the availability of such a large volume of information has made it difficult to identify content that is of particular interest to a user. Various organizations have attempted to address the problem by providing a description of the information that can be used to search, filter and/or browse to locate particular content. The Moving Picture Experts Group (MPEG) has promulgated a Multimedia Content Description Interface standard, commonly referred to as MPEG-7, to standardize the content descriptions for multimedia information. In contrast to preceding MPEG standards such as MPEG-1 and MPEG-2, which define coded representations of audio-visual content, an MPEG-7 content description describes the structure and semantics of the content and not the content itself.
Using a movie as an example, a corresponding MPEG-7 content description would contain “descriptors” (D), which are components that describe the features of the movie, such as titles for scenes, shots within scenes, and time, color, shape, motion, and audio information for the shots. The content description would also contain one or more “description schemes” (DS), which are components that describe relationships among two or more descriptors and/or description schemes, such as a shot description scheme that relates the features of a shot to one another. A description scheme can also describe relationships among other description schemes, and between description schemes and descriptors, such as a scene description scheme that relates the different shots in a scene to one another and relates the title feature of the scene to the shots.
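The containment relationship between descriptors and description schemes can be illustrated with a minimal sketch. The class names `Descriptor` and `DescriptionScheme`, the helper `count_descriptors`, and the sample values are hypothetical and chosen only for illustration; they are not taken from the MPEG-7 standard.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Descriptor:
    """A single feature of the content, such as a title or a color (hypothetical model)."""
    name: str
    value: str


@dataclass
class DescriptionScheme:
    """Relates descriptors and/or other description schemes (hypothetical model)."""
    name: str
    children: List[Union["Descriptor", "DescriptionScheme"]] = field(default_factory=list)


def count_descriptors(node: Union[Descriptor, DescriptionScheme]) -> int:
    """Count the descriptors reachable from a node, including nested schemes."""
    if isinstance(node, Descriptor):
        return 1
    return sum(count_descriptors(child) for child in node.children)


# A shot description scheme relates the features of one shot.
shot = DescriptionScheme(
    "Shot",
    [Descriptor("Time", "00:00:00-00:00:12"), Descriptor("Color", "warm")],
)

# A scene description scheme relates the scene title to the shots in the scene.
scene = DescriptionScheme("Scene", [Descriptor("Title", "Opening"), shot])
```

In this sketch a description scheme may hold both descriptors and other description schemes, which mirrors the scene and shot example in the preceding paragraph.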
MPEG-7 uses a Description Definition Language (DDL) both to define the standard set of description tools (descriptors and description schemes) and to define new description tools, and the standard provides a core set of descriptors and description schemes. The DDL definitions for a set of descriptors and description schemes are organized into “schemas” for different classes of content. The DDL definition for each descriptor in a schema specifies the syntax and semantics of the corresponding feature. The DDL definition for each description scheme in a schema specifies the structure and semantics of the relationships among its child components, that is, its descriptors and description schemes. The DDL may be used to modify and extend the existing description schemes and to create new description schemes and descriptors.
The MPEG-7 DDL is based on XML (Extensible Markup Language) and the XML Schema standards. The descriptors, description schemes, semantics, syntax, and structures are represented with XML elements and XML attributes. Some of the XML elements and attributes may be optional.
The MPEG-7 content description for a particular piece of content is defined as an instance of an MPEG-7 schema; that is, it contains data that adheres to the syntax and semantics defined in the schema. The content description is encoded in an “instance document” that references the appropriate schema. The instance document contains a set of “descriptor values” for the required elements and attributes defined in the schema, and for any necessary optional elements and/or attributes. For example, some of the descriptor values for a particular movie might specify that the movie has three scenes, with scene one having six shots, scene two having five shots, and scene three having ten shots. The instance document may be encoded in a textual format using XML, or in a binary format, such as the binary format specified for MPEG-7 data, known as “BiM,” or a mixture of the two formats.
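The movie example above can be sketched as a small textual instance document. The following is a minimal illustration only: the element names `Movie`, `Scene`, and `Shot`, the `id` attribute, and the function `build_instance` are hypothetical and do not correspond to element names defined in an MPEG-7 schema, and no schema reference or validation is shown.

```python
import xml.etree.ElementTree as ET

# Shots per scene, taken from the movie example in the text.
SHOTS_PER_SCENE = [6, 5, 10]


def build_instance(shots_per_scene):
    """Build a toy description tree with one Scene per entry and its Shot children."""
    movie = ET.Element("Movie")  # hypothetical root element name
    for scene_no, shot_count in enumerate(shots_per_scene, start=1):
        scene = ET.SubElement(movie, "Scene", {"id": f"scene{scene_no}"})
        for shot_no in range(1, shot_count + 1):
            ET.SubElement(scene, "Shot", {"id": f"scene{scene_no}-shot{shot_no}"})
    return movie


movie = build_instance(SHOTS_PER_SCENE)

# Serialize the tree to its textual XML form.
text = ET.tostring(movie, encoding="unicode")
```

The string `text` plays the role of the textual encoding; a binary encoding such as BiM would carry the same tree in a different representation and is not attempted here.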
The instance document is transmitted through a communication channel, such as a computer network, to another system that uses the content description data contained in the instance document to search, filter and/or browse the corresponding content data stream. Typically, the instance document is compressed for faster transmission. An encoder component may both encode and compress the instance document or the functions may be performed by different components. Furthermore, the instance document may be generated by one system and subsequently transmitted by a different system. A corresponding decoder component at the receiving system uses the referenced schema to decode the instance document. The schema may be transmitted to the decoder separately from the instance document, as part of the same transmission, or obtained by the receiving system from another source. Alternatively, certain schemas may be incorporated into the decoder.
Although compression can reduce transmission time by decreasing the size of the instance document, transmitting an entire content description over a network can still take too much time if the description is large. Therefore, to conserve bandwidth, it may be preferable to transmit only portions of the instance document. In general, a content description can be modeled as a tree that is composed of a set of sub-trees, or fragments. The determination of which fragments to send is application dependent.
A content description may be updated by adding, deleting or replacing description fragments, i.e., descriptors and description schemes, and/or attributes within fragments. The updates are transmitted to the receiving system through a series of packets, or “access units” in the MPEG-7 standard, that contain one or more fragment update units. The decoder on the receiving system updates its existing content description by applying the information in the fragment update units. Typically, a fragment update unit consists of three parts: a navigation path that directs the decoder to the appropriate location in the description tree at which to apply the update; an update command that specifies the type of update to execute, i.e., add, delete, or replace; and a fragment payload that supplies the update value for an add or replace command. Because every current update command must specify the correct path to the update location, the encoder must have prior knowledge of the description tree stored in the decoder before creating and transmitting the fragment update units. Thus, the current fragment update units can only construct the description tree at the decoder from the top down.
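The way a decoder might apply such a fragment update unit can be sketched as follows. This is an illustrative sketch under stated assumptions, not the MPEG-7 decoding procedure: the names `FragmentUpdateUnit` and `apply_update` are hypothetical, the navigation path is written in the limited XPath syntax of Python's `xml.etree.ElementTree` rather than in the MPEG-7 path encoding, and the element names are invented.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass
class FragmentUpdateUnit:
    """Hypothetical model of one update: where to apply it, what to do, and with what."""
    path: str                             # navigation path, relative to the root
    command: str                          # "add", "delete", or "replace"
    payload: Optional[ET.Element] = None  # needed for "add" and "replace" only


def apply_update(root: ET.Element, unit: FragmentUpdateUnit) -> None:
    """Apply one update to the decoder's tree; fail if the path does not exist yet."""
    target = root if unit.path == "." else root.find(unit.path)
    if target is None:
        raise LookupError(f"navigation path not found: {unit.path}")
    if unit.command in ("add", "replace") and unit.payload is None:
        raise ValueError(f"{unit.command} requires a payload")
    if unit.command == "add":
        target.append(unit.payload)       # the path names the parent of the new fragment
        return
    # ElementTree keeps no parent pointers, so build a child-to-parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    parent = parents.get(target)
    if parent is None:
        raise ValueError("this sketch cannot delete or replace the root")
    if unit.command == "delete":
        parent.remove(target)
    elif unit.command == "replace":
        parent[list(parent).index(target)] = unit.payload
    else:
        raise ValueError(f"unknown command: {unit.command}")


root = ET.fromstring("<Movie><Scene id='s1'><Shot id='a'/></Scene></Movie>")
apply_update(root, FragmentUpdateUnit("Scene[@id='s1']", "add", ET.Element("Shot", {"id": "b"})))
apply_update(root, FragmentUpdateUnit("Scene[@id='s1']/Shot[@id='a']", "delete"))

# An update addressed to a scene the decoder does not hold yet cannot be applied.
try:
    apply_update(root, FragmentUpdateUnit("Scene[@id='s2']", "add", ET.Element("Shot", {"id": "c"})))
    orphan_applied = True
except LookupError:
    orphan_applied = False
```

The failed update for scene `s2` illustrates the limitation described above: because each unit addresses an existing location, the parent must already be present in the decoder's tree, so the tree can only be built from the top down.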