Content streaming, such as the streaming of audio, video, text, or other media, means that data representing the content is provided over a network to a client computer on an as-needed basis rather than being delivered in its entirety before playback. Thus, the client computer renders streaming data as it is received from a network server, rather than waiting for an entire file to be delivered.
The widespread availability of streaming multimedia enables a variety of informational content that was not previously available over the Internet or other computer networks. Live content is one significant example of such content. Using streaming multimedia, audio, video, or audio/visual coverage of noteworthy events can be broadcast over the Internet as the events unfold. Similarly, television and radio stations can transmit their live content over the Internet. Content streaming can be implemented with one or more protocols.
For example, the Real-time Transport Protocol (RTP), as described in the Internet Engineering Task Force (IETF) RFC 1889, the entire disclosure of which is incorporated herein by reference, provides end-to-end network transport functions suitable for applications transmitting real-time data, such as audio, video or simulation data, over multicast or unicast network services. RTP does not address resource reservation and does not guarantee quality-of-service for real-time services. The data transport is augmented by a control protocol (RTCP) to allow monitoring of the data delivery in a manner scalable to large multicast networks, and to provide minimal control and identification functionality. RTP and RTCP are designed to be independent of the underlying transport and network layers.
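By way of illustration only, the fixed portion of an RTP packet header can be decoded as sketched below. The sketch assumes the 12-byte fixed header layout (version, padding, extension, CSRC count, marker, payload type, sequence number, timestamp, and synchronization source identifier). The names RtpHeader and parse_rtp_header and the sample packet are hypothetical and are not drawn from any particular implementation.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    """Fields of the 12-byte fixed RTP header (hypothetical container type)."""
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the fixed RTP header found at the start of a packet."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte fixed RTP header")
    # Network byte order: two single bytes, a 16-bit and two 32-bit integers.
    first, second, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=first >> 6,
        padding=bool(first & 0x20),
        extension=bool(first & 0x10),
        csrc_count=first & 0x0F,
        marker=bool(second & 0x80),
        payload_type=second & 0x7F,
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
    )


# Hypothetical packet: version 2, payload type 0, sequence 1, timestamp 160.
sample = struct.pack("!BBHII", 0x80, 0x00, 1, 160, 0x12345678) + b"payload"
header = parse_rtp_header(sample)
```

The sequence number and timestamp recovered in this way allow a receiver to reorder packets and schedule playback, which is consistent with RTP itself providing no delivery or quality-of-service guarantee.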
In addition, the Real-time Streaming Protocol (RTSP), as described in the IETF RFC 2326, the entire disclosure of which is incorporated herein by reference, is an application-level protocol for control of the delivery of data with real-time properties. RTSP provides an extensible framework to enable controlled, on-demand delivery of real-time data, such as audio and video. Sources of data can include both live data feeds and stored clips. This protocol is intended to control multiple data delivery sessions, provide a means for choosing delivery channels such as user datagram protocol (UDP), multicast UDP and transmission control protocol (TCP), and provide a means for choosing delivery mechanisms based upon RTP.
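As a non-limiting sketch, the text-based requests by which an RTSP client describes, sets up, and plays a presentation might be formatted as follows. The helper build_rtsp_request, the URL rtsp://example.com/presentation, the client ports, and the session identifier are hypothetical values chosen for illustration. The sketch only formats request text and performs no network communication.

```python
def build_rtsp_request(method, url, cseq, headers=None):
    """Format an RTSP/1.0 request: request line, CSeq, extra headers, blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # Header lines end with CRLF; an empty line terminates the request.
    return "\r\n".join(lines) + "\r\n\r\n"


base_url = "rtsp://example.com/presentation"  # hypothetical presentation URL

# Ask the server for a description of the presentation in SDP form.
describe = build_rtsp_request("DESCRIBE", base_url, 1,
                              {"Accept": "application/sdp"})

# Choose a delivery channel for one stream: RTP over unicast UDP.
setup = build_rtsp_request("SETUP", base_url + "/audio", 2,
                           {"Transport": "RTP/AVP;unicast;client_port=4588-4589"})

# Start delivery within the session established by the SETUP exchange.
play = build_rtsp_request("PLAY", base_url, 3,
                          {"Session": "12345678", "Range": "npt=0-"})
```

In this sketch the Transport header carries the choice of delivery channel, while the Session header ties the PLAY request to the previously established session.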
Further, the Session Description Protocol (SDP), as described in the IETF RFC 2327, the entire disclosure of which is incorporated herein by reference, is an application-level protocol intended for describing multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation. SDP can be used in conjunction with RTSP to describe and negotiate properties of the multimedia session used for delivery of real-time data.
A multimedia encoder can capture real-time audio and video data and represent the captured data as multiple streams. For example, audio is typically represented as one stream and video as another. Complex files can have multiple streams, some of which may be mutually exclusive. RTSP specifies a mechanism by which a client can ask a server to deliver one or more of the encoded media streams. RTSP also provides a way for the client to obtain information about the contents of the multimedia presentation, in the form of an SDP message, prior to delivery of the multimedia. SDP enumerates the available media streams and lists a limited set of auxiliary information (“SDP metadata”) that is associated with the collection of streams.
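The manner in which an SDP message enumerates media streams can be sketched as follows. The sample description, the function parse_sdp, and the dictionary layout it returns are assumptions made for illustration. The sketch reads only the type=value line structure of SDP and is not a complete SDP parser.

```python
def parse_sdp(text):
    """Split an SDP message into session-level fields and per-media entries."""
    session = {}
    media = []
    for raw in text.splitlines():
        line = raw.strip()
        # Every SDP line has the form <single character>=<value>.
        if len(line) < 2 or line[1] != "=":
            continue
        key, value = line[0], line[2:]
        if key == "m":
            # m=<media> <port> <transport> <format list>
            kind, port, proto, *formats = value.split()
            media.append({"type": kind, "port": port, "protocol": proto,
                          "formats": formats, "attributes": []})
        elif key == "a" and media:
            # Attributes after an m= line belong to that media stream.
            media[-1]["attributes"].append(value)
        elif not media:
            # Lines before the first m= line describe the whole session.
            session.setdefault(key, []).append(value)
        # Other media-level lines (for example c= or b=) are ignored here.
    return session, media


# Hypothetical description with one audio stream and one video stream.
sample_sdp = "\n".join([
    "v=0",
    "o=- 2890844526 2890842807 IN IP4 192.0.2.10",
    "s=Example presentation",
    "t=0 0",
    "m=audio 0 RTP/AVP 0",
    "a=control:audio",
    "m=video 0 RTP/AVP 31",
    "a=control:video",
])
session, media = parse_sdp(sample_sdp)
stream_types = [entry["type"] for entry in media]
```

Each media entry is an independent item in a flat list, with only session-level fields and per-media attributes available to carry auxiliary information.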
However, SDP cannot express complex relationships between streams, in part because SDP defines only a limited set of SDP metadata items. In addition, SDP has no notion of mutually exclusive streams. SDP likewise lacks support for specifying SDP metadata in multiple languages in a single SDP message. As such, SDP fails to adequately describe content encoded in certain formats.
For example, some multimedia encoders capture real-time audio and video data and save the content as an advanced streaming format (ASF) file (also referred to as active streaming format or advanced system format) as disclosed in U.S. Pat. No. 6,041,345. ASF is a file format specification for streaming multimedia files containing text, graphics, sound, video, and animation. An ASF file has objects including a header object containing information about the file, a data object containing the media streams (i.e., the captured audio and video data), and an optional index object that can help support random access to data within the file. The header object of an ASF file stores, as metadata, information that is needed by a client to decode and render the captured data. The list of streams and their relationships to each other is also stored in the header object of the ASF file. Some of the metadata items may be mutually exclusive because the metadata items describe the same information using different spoken languages. SDP fails to adequately describe content encoded in ASF.
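The kind of information carried in such a header object can be pictured with the simplified in-memory model below. This is a hypothetical sketch only: the classes StreamInfo and HeaderModel, the functions select_streams and localized, and all sample values are assumptions made for illustration and do not reproduce the binary layout of an ASF header object.

```python
from dataclasses import dataclass, field


@dataclass
class StreamInfo:
    number: int
    kind: str      # for example "audio" or "video"
    bitrate: int   # bits per second


@dataclass
class HeaderModel:
    streams: list
    # Each set names streams of which at most one should be delivered.
    exclusive_groups: list = field(default_factory=list)
    # Metadata item name -> {language tag: value in that language}.
    metadata: dict = field(default_factory=dict)


def select_streams(header, max_bitrate):
    """Pick every ungrouped stream plus exactly one stream per exclusive group."""
    by_number = {s.number: s for s in header.streams}
    grouped = set().union(*header.exclusive_groups)
    chosen = [s.number for s in header.streams if s.number not in grouped]
    for group in header.exclusive_groups:
        candidates = sorted((by_number[n] for n in group),
                            key=lambda s: s.bitrate)
        fitting = [s for s in candidates if s.bitrate <= max_bitrate]
        # Prefer the best stream that fits; otherwise fall back to the smallest.
        chosen.append((fitting[-1] if fitting else candidates[0]).number)
    return sorted(chosen)


def localized(header, name, language, fallback="en-us"):
    """Return one language variant of a metadata item, or the fallback variant."""
    values = header.metadata.get(name, {})
    return values.get(language, values.get(fallback))


# Hypothetical content: one audio stream and two alternative video encodings.
header = HeaderModel(
    streams=[StreamInfo(1, "audio", 64_000),
             StreamInfo(2, "video", 300_000),
             StreamInfo(3, "video", 1_000_000)],
    exclusive_groups=[{2, 3}],
    metadata={"title": {"en-us": "Launch coverage",
                        "fr-fr": "Couverture du lancement"}},
)
low = select_streams(header, 500_000)
high = select_streams(header, 2_000_000)
title_fr = localized(header, "title", "fr-fr")
```

The exclusive_groups and language-keyed metadata fields of this model correspond to the two kinds of relationship that a flat list of media entries with a single value per field does not capture.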
For these reasons, a system and method for embedding a streaming media format header within a session description message describing content in a streaming media session is desired to address one or more of these and other disadvantages.