Streaming technologies are used for delivering media, e.g. multimedia, provided by a streaming provider to an end-user such that the media is constantly received by and presented to the end-user. Hypertext Transfer Protocol (HTTP) streaming is a mechanism for sending data from a web server to a web browser. HTTP streaming (also known as HTTP server push or push technology) can be achieved through several mechanisms.
Adaptive HTTP streaming is becoming the dominant content streaming technique. Adaptive streaming (or adaptive bitrate streaming) is a technique used for streaming multimedia over computer networks. Today's adaptive streaming technologies are almost exclusively based on HTTP and designed to work efficiently over large distributed HTTP networks such as the Internet. In principle, adaptive streaming works by detecting a user's bandwidth and CPU capacity in real time and adjusting the quality of a video stream accordingly. It requires the use of an encoder which can encode a single source video at multiple bit rates. The player client switches between streaming the different encodings depending on available resources.
A number of different techniques exist, such as Apple's HTTP Live Streaming (HLS), Microsoft's Smooth Streaming (ISM) and 3GPP/Moving Picture Experts Group (MPEG) DASH.
These adaptive HTTP streaming techniques all share common principles: the client receives the content stream as a sequence of files, or as the responses to a sequence of byte-range requests, which is then decoded and played as a continuous media stream. The Uniform Resource Locators (URLs) of the file sequence are described in a manifest file, which is an .m3u8 playlist in the case of Apple's HLS, an .ismc file in the case of Microsoft's ISM and an .mpd file in the case of DASH.
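By way of illustration only, the following minimal Python sketch shows how a client might extract the available representations from an HLS master playlist. The playlist content, the URIs and the bitrates are hypothetical, and the function name parse_representations is introduced here for illustration; only the #EXT-X-STREAM-INF tag and its BANDWIDTH attribute are taken from the HLS playlist format.

```python
import re

# Hypothetical HLS master playlist; URIs and bitrates are made up for illustration.
MASTER_PLAYLIST = """#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=400000
low/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1200000
mid/index.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=3000000
high/index.m3u8
"""

# Match BANDWIDTH only when it starts an attribute (not e.g. AVERAGE-BANDWIDTH).
BANDWIDTH_RE = re.compile(r"(?:^|[:,])BANDWIDTH=(\d+)")


def parse_representations(playlist_text):
    """Return (bandwidth_bps, uri) pairs sorted by ascending bandwidth."""
    representations = []
    pending_bandwidth = None
    for line in playlist_text.splitlines():
        line = line.strip()
        if line.startswith("#EXT-X-STREAM-INF"):
            match = BANDWIDTH_RE.search(line)
            pending_bandwidth = int(match.group(1)) if match else None
        elif line and not line.startswith("#") and pending_bandwidth is not None:
            # The URI line following the tag names the variant playlist.
            representations.append((pending_bandwidth, line))
            pending_bandwidth = None
    return sorted(representations)


if __name__ == "__main__":
    for bandwidth, uri in parse_representations(MASTER_PLAYLIST):
        print(bandwidth, uri)
```

The sketch deliberately ignores all other playlist attributes; a real client would also have to handle the per-representation media playlists that list the individual segments.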
The main principles of adaptive HTTP streaming are illustrated in FIG. 1. First, the client 1 requests the manifest file from the server 2 by means of an “HTTP GET manifest file” request. Then, the server 2 transmits the manifest file to the client 1. The client 1 processes the manifest file and requests the first segment (e.g., with the lowest available media data rate (the lowest available quality)), as specified in the manifest file, from the server 2. During download of the manifest file, the client 1 measures the download speed and uses this estimation to select an appropriate representation (an appropriate quality) for the next (second) segment. For example, the client 1 selects a medium available media data rate (a medium available quality). The next segment is downloaded by the client 1 with a data rate slightly higher than the media data rate of the segment (otherwise the media, e.g. a video, will frequently stop playing). During the download of the second segment, the client 1 again measures the download speed.
In short, the client fetches one media segment (file) after another as described in the manifest file. During file download, the client estimates the available link bitrate (download speed). Depending on the difference between the available link bitrate and the encoded bitrate of the media, the client selects an appropriate quality representation (normally with a media bitrate slightly lower than the measured link bitrate).
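The client-side selection described above can be sketched as follows. This is a simplified illustration, not the behaviour of any particular player: the function names, the safety factor of 0.8 and the example bitrates are assumptions chosen only to show the principle of picking a representation slightly below the measured link bitrate.

```python
def estimate_throughput_bps(bytes_downloaded, seconds):
    """Estimate the link bitrate from one completed download."""
    if seconds <= 0:
        raise ValueError("download duration must be positive")
    return bytes_downloaded * 8 / seconds


def select_representation(bitrates_bps, throughput_bps, safety_factor=0.8):
    """Pick the highest media bitrate that fits below the measured throughput.

    safety_factor is an assumed margin (not from the source) that keeps the
    media bitrate slightly lower than the link bitrate. If nothing fits, the
    lowest available representation is used.
    """
    budget = throughput_bps * safety_factor
    candidates = [b for b in bitrates_bps if b <= budget]
    return max(candidates) if candidates else min(bitrates_bps)


if __name__ == "__main__":
    available = [400_000, 1_200_000, 3_000_000]  # hypothetical representations
    # A 500,000-byte segment downloaded in 2 s gives 2,000,000 bit/s.
    throughput = estimate_throughput_bps(500_000, 2.0)
    print(select_representation(available, throughput))
```

With these assumed values the budget is 1,600,000 bit/s, so the medium representation of 1,200,000 bit/s is selected for the next segment.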
To prepare a continuous stream of content for adaptive HTTP streaming, the stream is segmented into media segments (files) on the server side. These media segments are fetched by the client one-by-one as independent files. The client plays the segments contiguously and thereby provides a continuous stream playout. This is also illustrated in FIG. 2.
Adaptive HTTP streaming servers provide the clients with a list of different representations (typically bitrates) to choose from on a fragment basis, so that each client is able to adapt the media bitrate to the available link bitrate. This is a client-centric approach that aims at providing an interruption-free viewing experience on the client screen and does not take other clients into account.
Active clients (per cell or limiting link) will adapt to the link bitrate they experience, and will thereby converge to approximately equal average media bitrates.
However, the perceived media quality (e.g., perceived video quality) as experienced by a user of the client does not only depend on the bitrate, but also very much on the type of media content. For example, sports content typically requires double the media bitrate compared to talk-shows, to achieve the same perceived quality in terms of subjective video quality. Subjective media quality measures are used to deal with subjective characteristics of media (e.g., video) quality. These measures are concerned with how media, like video, is perceived by a viewer and designate his or her opinion on a particular video sequence. There are different ways for measuring the perceived quality. One way is the so-called Mean Opinion Score (MOS).
The MOS is generated by averaging the results of a set of standard, subjective tests where a number of test subjects rate the viewed video quality (or the heard audio quality) of test sequences. Each test subject is required to give each sequence a rating using the following rating scheme: 5=Excellent, 4=Good, 3=Fair, 2=Poor, 1=Bad. The MOS is the arithmetic mean of all the individual scores, and can range from 1 (worst) to 5 (best).
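As a brief illustration, the MOS computation can be sketched in Python as the arithmetic mean of the individual ratings. The function name and the example ratings are hypothetical and serve only to show the arithmetic.

```python
VALID_RATINGS = (1, 2, 3, 4, 5)  # 1=Bad, 2=Poor, 3=Fair, 4=Good, 5=Excellent


def mean_opinion_score(ratings):
    """Return the arithmetic mean of individual 1-to-5 ratings."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in VALID_RATINGS for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(ratings) / len(ratings)


if __name__ == "__main__":
    # Hypothetical ratings from five test subjects for one test sequence.
    print(mean_opinion_score([4, 5, 3, 4, 4]))
```

For the five assumed ratings the sum is 20, so the resulting MOS is 4.0.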
As stated above, different media contents require different bit rates in order to be perceived as having the same quality. For example, sports content typically requires double the media bitrate compared to talk-shows, to achieve the same MOS. However, since all adaptation is done in the client, on an individual basis, there is no way to ensure that sessions whose content requires a higher media bit rate also get a higher link bit rate.
In this context, FIG. 3 illustrates two mobile clients 1a, 1b. A first mobile client 1a currently downloads sports content from a server 2 via an available link. A second mobile client 1b currently downloads talk-show content from the server 2 via the same available link. Both contents, i.e. the sports content and the talk-show content, are downloaded with the same download speed (bit rate). In other words, the available bit rate is shared equally between the two different content streams. In consequence, although the different content streams are downloaded with the same bit rate, the perceived quality of the sports content will presumably be worse than the perceived quality of the talk-show content, as the sports content requires a higher bit rate in order to achieve the same perceived quality as the talk-show content.
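The effect of an equal split can be illustrated numerically. The following Python sketch assumes a purely hypothetical logarithmic mapping from bitrate to MOS, in which sports content needs twice the bitrate of talk-show content for the same score; the curve shape, the limits of 250 and 4000 kbit/s and the link bitrate of 4000 kbit/s are assumptions made for this example and are not measured values.

```python
import math

MIN_KBPS = 250.0   # assumed bitrate at or below which the MOS is 1
MAX_KBPS = 4000.0  # assumed bitrate at or above which the MOS is 5


def utility_mos(bitrate_kbps, complexity):
    """Hypothetical utility function mapping bitrate to MOS.

    complexity scales the bitrate needed for a given MOS
    (assumed 1.0 for talk-show content, 2.0 for sports content).
    """
    effective = bitrate_kbps / complexity
    if effective <= MIN_KBPS:
        return 1.0
    if effective >= MAX_KBPS:
        return 5.0
    return 1.0 + 4.0 * math.log2(effective / MIN_KBPS) / math.log2(MAX_KBPS / MIN_KBPS)


if __name__ == "__main__":
    link_kbps = 4000.0
    share = link_kbps / 2  # equal split between the two clients
    print("talk-show MOS:", utility_mos(share, complexity=1.0))
    print("sports MOS:", utility_mos(share, complexity=2.0))
```

Under these assumptions both clients receive 2000 kbit/s, yet the talk-show content scores 4.0 while the sports content scores only 3.0.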
For adaptive streaming in general, it has been proposed in “QoE-Driven Cross-Layer Optimization for High Speed Downlink Packet Access”, Journal of Comm., Vol. 4, No. 9, 2009, to perform a Quality of Experience (QoE)-driven cross-layer optimization using a utility function for each content that maps bitrate to MOS. One can thereby maximize the number of satisfied users in a mobile network.
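To illustrate the general idea of a utility-driven allocation, the following Python sketch searches for the split of a shared link between two streams that maximizes the lower of the two MOS values. It is a simplified stand-in and not the optimization of the cited paper: the logarithmic utility function, its limits, the complexity factors and the max-min objective are all assumptions made for this example.

```python
import math

MIN_KBPS = 250.0   # assumed bitrate at or below which the MOS is 1
MAX_KBPS = 4000.0  # assumed bitrate at or above which the MOS is 5


def utility_mos(bitrate_kbps, complexity):
    """Hypothetical per-content utility function mapping bitrate to MOS."""
    effective = bitrate_kbps / complexity
    if effective <= MIN_KBPS:
        return 1.0
    if effective >= MAX_KBPS:
        return 5.0
    return 1.0 + 4.0 * math.log2(effective / MIN_KBPS) / math.log2(MAX_KBPS / MIN_KBPS)


def best_split(link_kbps, complexity_a, complexity_b, steps=100):
    """Grid-search the split that maximizes the lower of the two MOS values.

    Returns (worst_mos, rate_a_kbps, rate_b_kbps).
    """
    best = None
    for i in range(steps + 1):
        rate_a = link_kbps * i / steps
        rate_b = link_kbps - rate_a
        worst = min(utility_mos(rate_a, complexity_a),
                    utility_mos(rate_b, complexity_b))
        if best is None or worst > best[0]:
            best = (worst, rate_a, rate_b)
    return best


if __name__ == "__main__":
    # Stream a: sports (assumed complexity 2.0); stream b: talk-show (1.0).
    worst, sports_kbps, talk_kbps = best_split(4000.0, 2.0, 1.0)
    print(worst, sports_kbps, talk_kbps)
```

Under these assumptions the search assigns roughly two thirds of the link to the sports stream, and the lower MOS rises above the value obtained with an equal split.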
Furthermore, in “QoE-based rate adaptation scheme selection for resource-constrained wireless video transmission”, Proceedings of MM '10, ACM, media is transcoded in the network based on such utility functions in order to optimize the total perceived video quality. However, this scheme is not directly applicable to the case of adaptive HTTP streaming, where the client chooses the version of the media that corresponds best to the available bandwidth.