For purposes of this description, unless otherwise stated or implied from context, the following definitions are made. By media stream, we mean data consisting of one media type. Examples are video data and audio data. For simplicity, the adjective "media" is sometimes left out when it is clear from context. By media presentation (or multimedia presentation), we mean a combination of one or more media streams where the different media streams have a rendering relationship during playback. By elementary stream type, we mean a media stream consisting of data which can be rendered in the absence of any other media. The base layer of a layered video stream is an elementary stream type, while enhancement layers which depend upon the base layer are not elementary stream types. By coupled stream type, we mean a media stream consisting of data which cannot be rendered in the absence of other specific media streams. A common type of coupling may be due to hierarchical encoding in which enhancement layers require either the base layer or other enhancement layers. Another example of a coupled stream type may be a media stream which describes transforms to be performed on some other elementary stream type.
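The distinction between elementary and coupled stream types can be illustrated with a minimal sketch. It assumes, purely for illustration, that each stream carries a list of the streams it depends upon; the names MediaStream, depends_on, and can_render are hypothetical and do not appear elsewhere in this description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MediaStream:
    """One media stream; all field names are illustrative assumptions."""
    name: str
    media_type: str               # e.g. "video" or "audio"
    depends_on: tuple = ()        # names of streams required for rendering

    @property
    def is_elementary(self) -> bool:
        # An elementary stream type can be rendered without any other media.
        return not self.depends_on


def can_render(stream: MediaStream, available: dict) -> bool:
    """Return True if every direct and indirect dependency of `stream`
    is present in `available` (a mapping from stream name to MediaStream)."""
    seen = set()
    stack = [stream]
    while stack:
        current = stack.pop()
        if current.name in seen:
            continue
        seen.add(current.name)
        for dep_name in current.depends_on:
            if dep_name not in available:
                return False
            stack.append(available[dep_name])
    return True


# A layered video stream: the base layer is elementary, the enhancement
# layers are coupled to the base layer or to other enhancement layers.
base = MediaStream("video-base", "video")
enh1 = MediaStream("video-enh1", "video", depends_on=("video-base",))
enh2 = MediaStream("video-enh2", "video", depends_on=("video-enh1",))
```

Under these assumptions, enh2 can be rendered only when both enh1 and base are available, which reflects the hierarchical coupling described above.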
A common protocol for the transmission of media streams is the Real-time Transport Protocol (RTP). See Internet Engineering Task Force (IETF) RFC 1889. Currently, the relationships among the set of media streams comprising most media presentations are static. For example, a presentation will often consist of a single video stream and an associated audio track for the entire duration of the media presentation. For such static media presentations, the Session Description Protocol (SDP) serves to describe the relationships among the media streams comprising the presentation. See IETF Internet Draft, "draft-ietf-mmusic-sdp.04.txt".
It is anticipated that there will be greater demand in the future for dynamic media presentations. For a dynamic media presentation, the set of media streams comprising the presentation, and the dependencies among them, will change during the presentation. For example, a media presentation may start as a single video stream and an associated audio stream, and may later change to multiple layers of video streams to enable client scalability, or additional audio streams may be added to the presentation at a client's request to provide higher fidelity.
It would therefore be advantageous to provide a method of communication (or association) to dynamically describe the relationships (or dependencies) among the components comprising a media presentation, thereby allowing the composition of a media presentation to be varied in response to information that becomes available as the presentation progresses. Furthermore, it would be advantageous that any such method for describing a dynamic media presentation be orthogonal to the actual type of information each media stream contains. In addition, it would be advantageous to provide a method of communication (or association) whereby some components of a presentation are dynamically specified as required and others are dynamically specified as optional. Such a dynamic description would allow, for instance, all clients to receive a base level audio stream and a base level video stream so as to render the core of a presentation, while allowing additional optional components to be received by those clients that have the processing or bandwidth capability to utilize them. Such dynamic descriptions of media stream associations are not provided by SDP.
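By way of illustration only, the kind of dynamic description contemplated above might be sketched as follows. The sketch assumes a timeline of descriptions, each taking effect at a given presentation time and marking each component as required or optional. The names Component, Description, active_description, and select, as well as the bandwidth figures, are hypothetical and are not drawn from SDP, RTP, or any other protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    kbps: int        # hypothetical bandwidth cost of the component
    required: bool   # True: all clients receive it; False: optional


@dataclass(frozen=True)
class Description:
    start_s: float       # presentation time at which this description applies
    components: tuple    # the Components making up the presentation


def active_description(timeline, t):
    """Return the latest description whose start time is <= t, or None."""
    current = None
    for desc in sorted(timeline, key=lambda d: d.start_s):
        if desc.start_s <= t:
            current = desc
    return current


def select(description, budget_kbps):
    """Pick every required component, then add optional components in
    listed order while the client's remaining bandwidth allows.
    Required components are taken even if they exceed the budget."""
    required = [c for c in description.components if c.required]
    chosen = [c.name for c in required]
    remaining = budget_kbps - sum(c.kbps for c in required)
    for c in description.components:
        if not c.required and c.kbps <= remaining:
            chosen.append(c.name)
            remaining -= c.kbps
    return chosen


audio_base = Component("audio-base", 64, required=True)
video_base = Component("video-base", 300, required=True)
video_enh = Component("video-enh", 500, required=False)
audio_hifi = Component("audio-hifi", 128, required=False)

# The composition changes 60 seconds into the presentation.
timeline = [
    Description(0.0, (audio_base, video_base)),
    Description(60.0, (audio_base, video_base, video_enh, audio_hifi)),
]
```

In this sketch, every client renders the base audio and base video streams throughout, while a client with spare bandwidth additionally receives the optional components once the description in effect at the current presentation time lists them. The description itself makes no reference to the content of any stream, consistent with the orthogonality noted above.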