An important application of transmission networks like the Internet or mobile telephone networks is media delivery from a server to a client. Media may be, for example, audio and video.
Media delivery in IP (Internet Protocol) based networks may use different transport protocols. Traditionally, either RTP (Real-time Transport Protocol) over UDP (User Datagram Protocol) is used for real-time streaming and packet-based streaming, or HTTP (Hyper Text Transfer Protocol) over TCP (Transmission Control Protocol) is used for the download of whole files, mostly for later consumption but also for live streaming. RTP allows for dynamic adaptation to the available bit rate as measured by the client. A drawback of RTP and the associated control protocol RTSP (Real-time Streaming Protocol) is the need for specialized and more complicated server software, while HTTP can use widely deployed and inexpensive HTTP server software. A recent development, Dynamic Adaptive HTTP Streaming (DASH), aims at combining the advantages of both approaches. DASH is standardized in 3GPP (Third Generation Partnership Project) Technical Specification (TS) 26.234 v 9.4.0 Transparent end-to-end Packet-switched Streaming Service (PSS), and has also been adopted and slightly extended by the Open IPTV Forum (OIPF) and MPEG (Moving Pictures Experts Group).
In DASH, the content (which is also denoted as media herein) is encoded in different versions, usually corresponding to different bit rates. If the content is, for example, a video with a video track and an audio track, the video track could be encoded in three versions, each with a different bit rate, and the audio track in a high-quality stereo version and a mono version. Each version is further divided into segments of a few seconds' duration. For example, the video versions can be divided into many consecutive segments of 10 seconds' duration each. The segments may be formatted according to the MPEG-4 file format, or according to the MPEG-2 transport stream format.
The actual transmission of the video and audio tracks is performed by downloading one segment after the other, with each download initiated by the client. In this procedure the client downloads a segment using a standard HTTP request, unpacks, decodes, and renders it, and then does the same for the next segment and so forth for further segments. The client knows the available quality versions and the division into segments over time from a media description, the so-called Media Presentation Description (MPD). The MPD format as defined in 3GPP TS 26.234, OIPF, and MPEG is an XML (eXtensible Markup Language) encoded file containing appropriate information and attributes to describe the media. The MPD is the first resource transmitted to a client in order to start a DASH based media delivery. In other words, the purpose of the MPD is to give location and timing information to the client so that it can fetch and play back the media segments of a particular content.
The MPD consists of three major components, namely Periods, Representations and Segments. As depicted in FIG. 11, Period elements are the outermost part of the MPD. Periods are typically larger pieces of media that are played out sequentially. Inside a period, multiple different encodings of the content may occur. Each alternative of a period is called a Representation. These alternative Representations can have, for example, different bitrates, frame rates or video resolutions. Finally, each Representation describes a series of segments by media links, e.g. HTTP Uniform Resource Locators (URLs). Those URLs are either explicitly described in the Representation (similar to a playlist) or described through a template construction, which allows the client to derive a valid URL for each segment of a representation. Content play-list or advertisement-insertion functionality can easily be achieved by chaining periods of different content.
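The Period, Representation and Segment hierarchy described above can be sketched as a small data model. This is only an illustration under stated assumptions: the class and field names, the `{rep_id}`/`{index}` placeholder syntax, and the `example.com` URLs are hypothetical and are not the XML schema or template syntax defined in 3GPP TS 26.234. The sketch shows the two ways a Representation may describe its segments, either as an explicit list or through a template from which the client derives each URL.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Representation:
    """One encoding alternative of a Period (hypothetical model)."""
    rep_id: str
    bandwidth_bps: int
    # Explicit, playlist-like list of segment URLs (may be empty).
    segment_urls: List[str] = field(default_factory=list)
    # Assumed template syntax using Python format placeholders.
    url_template: str = ""
    segment_count: int = 0

    def urls(self) -> List[str]:
        """Return the segment URLs, deriving them from the template if needed."""
        if self.segment_urls:
            return list(self.segment_urls)
        return [
            self.url_template.format(rep_id=self.rep_id, index=i)
            for i in range(1, self.segment_count + 1)
        ]


@dataclass
class Period:
    """A piece of media played out sequentially with other Periods."""
    start_s: float
    representations: List[Representation]


@dataclass
class MPD:
    """Outermost container: an ordered chain of Periods."""
    periods: List[Period]


# Usage: one Period offering a templated and an explicit Representation.
templated = Representation(
    rep_id="video-500k",
    bandwidth_bps=500_000,
    url_template="http://example.com/{rep_id}/seg-{index}.mp4",
    segment_count=3,
)
explicit = Representation(
    rep_id="audio-mono",
    bandwidth_bps=32_000,
    segment_urls=["http://example.com/audio/a1.mp4", "http://example.com/audio/a2.mp4"],
)
mpd = MPD(periods=[Period(start_s=0.0, representations=[templated, explicit])])
```

Chaining a second `Period` with different content into `mpd.periods` would model the play-list or advertisement-insertion case mentioned above.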
Each segment is downloaded at the maximum speed available under the present operating conditions of the network used for transmission, and the client monitors the download speed it experiences. Based on the experienced download speed the client selects the most appropriate of the available quality versions. From segment to segment this may be a different version, and the client can download different qualities depending on the present operating conditions, hence the attribute “adaptive” HTTP streaming. FIG. 1 visualizes the principle and shows different media representations for adaptive HTTP streaming of a content item as a function of the playout time. The three representations in FIG. 1, i.e. “Representation 1”, “Representation 2”, and “Rep. 3”, may correspond to a high, medium and low bitrate representation, respectively, of a content item, i.e. stream. The beginning and end of the playout time for the stream segments (a segment may be abbreviated as “Seg.”) of different representations coincide so that smooth switching between the representations is possible. The vertical scale in FIG. 1 illustrates the data size of the different stream representations, e.g. their bit rate. Depending on the client implementation, enhanced selection procedures are possible for switching between the representations, e.g. including a hysteresis in order to avoid excessive quality fluctuations when viewing or listening to a stream.
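The selection step with hysteresis can be sketched as follows. This is a minimal illustration, not a procedure taken from the DASH specifications: the function name, the `up_margin` parameter and its default value are assumptions. The client picks the highest bit rate that the measured throughput sustains, but switches to a higher quality only if the throughput exceeds that bit rate by a safety margin, which damps quality fluctuations.

```python
from typing import Sequence


def select_representation(
    bitrates_bps: Sequence[int],
    throughput_bps: float,
    current_index: int,
    up_margin: float = 1.3,  # assumed hysteresis factor for up-switching
) -> int:
    """Return the index of the representation for the next segment.

    bitrates_bps must be sorted in ascending order. The lowest
    representation is used as a fallback if none is sustainable.
    """
    # Highest representation whose bit rate fits the measured throughput.
    best = 0
    for i, rate in enumerate(bitrates_bps):
        if rate <= throughput_bps:
            best = i

    # Hysteresis: only move up if the throughput clears the margin.
    while best > current_index and bitrates_bps[best] * up_margin > throughput_bps:
        best -= 1

    # Down-switches are applied immediately to avoid buffer underrun.
    return best


# Usage with three hypothetical video bit rates.
BITRATES = [250_000, 500_000, 1_000_000]
stay_low = select_representation(BITRATES, 550_000, current_index=0)
switch_up = select_representation(BITRATES, 700_000, current_index=0)
switch_down = select_representation(BITRATES, 600_000, current_index=2)
```

Because segment boundaries coincide across representations, the returned index can change from one segment to the next without disturbing playback.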
During a DASH session, the MPD may be updated at the HTTP server. Especially in the case of live streaming, the MPD is usually updated on a regular basis, e.g., to add other content items such as advertisements to the media presentation. In particular, at least one of the Periods, Representations and Segments may be changed when updating an MPD. For example, an updated MPD may contain new or additional segments (to be rendered in the future) that were not included in the previous one, or may remove old media segments that should already have been rendered by the clients. The update may also modify the number of available media qualities, e.g., media bitrates.
In order to get an updated MPD, the client, i.e. the user equipment (UE), must send an HTTP request to the HTTP server, which returns the current MPD in an HTTP response. To learn of an MPD update promptly, the UE may send frequent HTTP requests to the HTTP server, i.e., at a rate that is higher than the MPD update rate. In this case, the MPD has not been updated between many of the requests, and the HTTP server responds with one or more HTTP responses comprising an MPD that has already been delivered to the UE. Hence, unnecessary HTTP requests and responses are exchanged, which wastes resources between the UE and the HTTP server. This is especially harmful if the transmission path includes radio links of a mobile communication network.
On the other hand, if the UE sends HTTP requests for MPD updates infrequently, i.e., at a rate lower than the MPD update rate, the UE may lack the MPD update for a longer time, i.e. the time between the MPD update and the next HTTP response comprising the updated MPD. In the meantime, the UE can run out of media segments in its buffer, and the media play-out is interrupted.
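The trade-off between frequent and infrequent polling can be made concrete with a small counting model. It is a sketch under simplifying assumptions that are not part of the source: the MPD is updated at a fixed interval, the client polls at a fixed interval, both intervals are whole seconds, and network delay is ignored. The function name and parameters are hypothetical.

```python
from typing import Tuple


def polling_cost(update_interval_s: int, poll_interval_s: int, duration_s: int) -> Tuple[int, int]:
    """Return (redundant_polls, worst_update_delay_s) over a session.

    A poll is redundant if it returns an MPD version the client already
    has. The delay is measured from the publication of the oldest update
    the client has not yet seen to the poll that finally fetches it.
    """
    redundant = 0
    worst_delay = 0
    last_seen_version = 0  # the client holds version 0 from session start
    t = poll_interval_s
    while t <= duration_s:
        current_version = t // update_interval_s  # updates published by time t
        if current_version == last_seen_version:
            redundant += 1
        else:
            first_unseen_time = (last_seen_version + 1) * update_interval_s
            worst_delay = max(worst_delay, t - first_unseen_time)
            last_seen_version = current_version
        t += poll_interval_s
    return redundant, worst_delay


# Frequent polling: many wasted request/response pairs, no added delay.
frequent = polling_cost(update_interval_s=10, poll_interval_s=2, duration_s=60)
# Infrequent polling: nothing wasted, but updates are seen late.
infrequent = polling_cost(update_interval_s=10, poll_interval_s=25, duration_s=100)
```

In this model, polling every 2 seconds against a 10-second update interval wastes four out of every five requests, while polling every 25 seconds leaves the client up to 20 seconds behind the newest MPD, which may exceed the media it has buffered.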
Another trend in multimedia communication is the usage of the IP Multimedia Subsystem (IMS) for the initiation and control of multimedia sessions. Within 3GPP, standardized solutions for IMS controlled RTP streaming as well as for IMS controlled HTTP progressive download are defined in 3GPP TS 26.237 V9.3.0 (2010-06) with the title IP Multimedia Subsystem (IMS) based Packet Switch Streaming (PSS) and Multimedia Broadcast/Multicast Service (MBMS) User Service; Protocols. These solutions benefit from the standardized features offered by IMS like charging, authentication or QoS (Quality of Service) reservation.
FIG. 2 shows the different signaling steps in the case of IMS controlled HTTP progressive download as defined in 3GPP TS 26.237. The session is initiated with a SIP (Session Initiation Protocol) INVITE message which includes SDP (Session Description Protocol) information. The HTTP URL (Uniform Resource Locator) for download is delivered to the user equipment (UE), i.e. client, via the SIP 200 OK message. In addition, a QoS reservation for the HTTP progressive download session may be carried out. The progressive download itself is initiated by the UE with an HTTP GET command towards the HTTP server, which in return responds with the requested content file. In more detail, the following steps are performed:
1. The UE initiates the progressive download session by sending a SIP INVITE to the IM CN (IP Multimedia Core Network) subsystem, including an SDP offer.
2. The IM CN subsystem forwards the SIP INVITE message to the SCF (Service Control Function).
3. The SCF verifies the user rights for the requested content, selects an HTTP/SIP adapter, and forwards the SIP INVITE message to the HTTP/SIP adapter.
4. The HTTP/SIP adapter selects an HTTP server and sends an HTTP POST message to the HTTP server, including the IP address of the UE.
5. The HTTP server answers the HTTP/SIP adapter with an HTTP 200 OK response.
6. The HTTP/SIP adapter sends the SIP 200 OK answer to the SCF, including the download URL of the requested content file in the SDP answer.
7. The SCF forwards the SIP 200 OK to the IM CN subsystem.
8. The IM CN subsystem forwards the SIP 200 OK to the UE.
9. The UE sends an HTTP request to the URL obtained from the SIP 200 OK message.
10. The HTTP server delivers the content file in the HTTP response to the UE.
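The ten signaling steps above can be written down as a simple message table, which makes the path of the session setup easy to inspect. This is only a descriptive sketch: the `Message` type, the helper functions and the short message labels are hypothetical, and no SIP or HTTP traffic is actually generated.

```python
from typing import List, NamedTuple


class Message(NamedTuple):
    step: int
    sender: str
    receiver: str
    message: str


# Steps 1 to 10 of the IMS controlled HTTP progressive download.
PROGRESSIVE_DOWNLOAD_FLOW: List[Message] = [
    Message(1, "UE", "IM CN subsystem", "SIP INVITE (SDP offer)"),
    Message(2, "IM CN subsystem", "SCF", "SIP INVITE"),
    Message(3, "SCF", "HTTP/SIP adapter", "SIP INVITE"),
    Message(4, "HTTP/SIP adapter", "HTTP server", "HTTP POST (UE IP address)"),
    Message(5, "HTTP server", "HTTP/SIP adapter", "HTTP 200 OK"),
    Message(6, "HTTP/SIP adapter", "SCF", "SIP 200 OK (download URL in SDP answer)"),
    Message(7, "SCF", "IM CN subsystem", "SIP 200 OK"),
    Message(8, "IM CN subsystem", "UE", "SIP 200 OK"),
    Message(9, "UE", "HTTP server", "HTTP request to download URL"),
    Message(10, "HTTP server", "UE", "HTTP response (content file)"),
]


def is_continuous_chain(flow: List[Message]) -> bool:
    """True if every message is sent by the receiver of the previous one."""
    return all(prev.receiver == nxt.sender for prev, nxt in zip(flow, flow[1:]))


def messages_sent_by(flow: List[Message], entity: str) -> List[str]:
    """List the messages a given entity sends during the session."""
    return [m.message for m in flow if m.sender == entity]


ue_messages = messages_sent_by(PROGRESSIVE_DOWNLOAD_FLOW, "UE")
```

The table shows that the UE sends only two messages: the SIP INVITE that sets up the session through the IMS entities, and the HTTP request that goes directly to the HTTP server once the download URL is known.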
The IMS controlled HTTP progressive download according to 3GPP TS 26.237 v 9.3.0 applies to the delivery of content files, but does not apply to the delivery of media presentations such as MPDs.