Media content streaming over the Internet is becoming increasingly popular. Recent research shows that streaming traffic constitutes the largest share of overall Internet traffic today, and this share is expected to grow as mobile devices (typically, smart phones) are increasingly used for mobile streaming services. With a growing number of mobile phone users using mobile streaming, there is a need for streaming optimizations that improve the streaming experience on mobile devices.
Many streaming services on the Internet are based on the so-called “progressive download” technology, which involves receiving content as a continuous stream of data. Such services include ‘YouTube’ and ‘DailyMotion’. With “progressive download”, the client requests the media content from a streaming server (typically, with the HTTP protocol) and, after a short “pre-buffering” delay, the client starts playing the media content while the download process continues. In effect, playback and download operate in parallel. In this case, it is important that the download rate matches the playback rate. If the communication channel cannot support a sufficiently large download rate, the playback will stop since the client eventually runs out of media content. On the other hand, if the download rate is higher than the playback rate, the client will eventually run out of buffering resources and will start dropping further media content or will advise the streaming server to stop sending more content (e.g., by closing the TCP receive window).
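The relationship between download rate, playback rate and buffer size described above can be illustrated with a minimal simulation. This is only a sketch under stated assumptions: the function name `simulate_buffer`, the one-second time step, and the rule that playback resumes only after the pre-buffering threshold is reached again are all hypothetical and are not taken from any particular client implementation.

```python
def simulate_buffer(download_kbps, playback_kbps, capacity_kbit,
                    prebuffer_kbit, seconds):
    """Step a playback buffer once per second (hypothetical model).

    Returns (stalls, overflow_seconds, final_level_kbit).
    """
    level = 0.0            # buffered media, in kbit
    playing = False
    stalls = 0             # number of times playback ran out of data
    overflow_seconds = 0   # seconds in which the buffer was full
    for _ in range(seconds):
        # Download: add one second of data, capped at the buffer capacity.
        level += download_kbps
        if level > capacity_kbit:
            level = capacity_kbit
            overflow_seconds += 1  # client must drop data or throttle the server
        # Playback (re)starts once the pre-buffering threshold is reached.
        if not playing and level >= prebuffer_kbit:
            playing = True
        if playing:
            if level >= playback_kbps:
                level -= playback_kbps  # consume one second of media
            else:
                # Not enough data for one second of playback: stall.
                playing = False
                stalls += 1
    return stalls, overflow_seconds, level


# Download slower than playback: the buffer drains and playback stalls.
print(simulate_buffer(500, 800, 8000, 1600, 60))
# Download faster than playback: the buffer eventually fills up.
print(simulate_buffer(1200, 800, 8000, 1600, 60))
```

In this model, a download rate below the playback rate produces repeated stalls, while a download rate above it fills the buffer, at which point a real client would drop data or close its TCP receive window.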
The traffic rate of a typical progressive download streaming session is shown in FIG. 1. Specifically, the traffic rate is shown for a mobile device requesting a YouTube video file. This streaming session is conducted over the HTTP protocol. Initially, during the so-called “pre-buffering period” 10, the streaming server sends data to the client as fast as possible, in order to quickly fill up the playback buffer of the client and make it possible to start the playback. After the pre-buffering period, however, the server reduces its transmission rate to about 800 Kbps (during the rate-controlled period 12), so that the client receives data at least as fast as it consumes data and keeps its playback buffer sufficiently full to protect against short communication outages.
FIG. 2 again shows the traffic rate of a streaming session, this time when a Flash-enabled client (e.g., Microsoft Internet Explorer) requests a YouTube video file. As before, the pre-buffering period 14 is followed by the rate-controlled period 16, where the traffic rate, in this example, is approximately constant at about 670 Kbps. The traffic rate during the rate-controlled period depends on the encoded rate of the requested video file (and, thus, on the quality of the video file). The larger the encoded rate (or the video quality), the higher the traffic rate.
One problem with progressive download streaming is that when the communication channel becomes congested for some time and cannot support the required traffic rate (the rate that matches the playback rate), the playback buffer of the client may run out of media content, thus causing the streaming session to stop (point 18 in FIG. 3). This creates a bad user experience. One easy way to tackle this problem is to implement clients with large playback buffers. However, such large buffers may considerably increase the pre-buffering period, which again has a negative impact on the user experience, and they also demand considerable memory resources in mobile devices.
There are currently several streaming technologies designed to operate efficiently over time-varying mobile communications channels, for example, adaptive progressive download streaming, adaptive HTTP Live Streaming (HLS) and 3GPP Dynamic Adaptive Streaming over HTTP (DASH).
With adaptive progressive download streaming, in order to adapt to varying channel conditions, the streaming server can modify its transmission rate based on Quality of Service (QoS) measurements reported by the client. This situation is schematically shown in FIG. 4. At the beginning of an adaptive rate-controlled period 24, the server transmits the media with the highest available quality, which corresponds to transmission rate 1. At some point, the communications channel becomes congested and cannot support transmission rate 1. As a consequence, the client observes a transmission rate lower than transmission rate 1 for a given time period and reports this observation back to the streaming server. In response, the server switches to a lower transmission rate (at point 26), which corresponds to lower-quality media content. The media content is now transmitted with transmission rate 2. If the communications channel becomes further congested, the channel may become incapable of supporting transmission rate 2 and, again, the server (after receiving the client's feedback) switches to an even lower rate (at point 28), transmission rate 3. This corresponds to an even lower streaming quality. Despite the lower quality, the streaming session continues. Of course, if the communications channel later improves, the transmission rate may be scaled up to improve the streaming quality. As a consequence, the streaming session may run uninterrupted with a quality that matches the observed channel conditions.
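The server-side rate switching described above can be sketched as a small decision function. This is an illustration only, not the method of any specific streaming server: the function name `next_rate_index`, the example rates, the `headroom` factor used for scaling up, and the assumption that the client reports an estimate of the achievable channel throughput are all hypothetical.

```python
def next_rate_index(rates_kbps, current, observed_kbps, headroom=1.25):
    """Pick the transmission rate for the next period (hypothetical rule).

    rates_kbps is sorted from highest to lowest quality; current is the
    index of the rate in use; observed_kbps is the client's QoS report.
    """
    # Step down when the client reports less than the current rate requires.
    if observed_kbps < rates_kbps[current] and current < len(rates_kbps) - 1:
        return current + 1
    # Step up only when the report comfortably exceeds the next higher rate.
    if current > 0 and observed_kbps >= headroom * rates_kbps[current - 1]:
        return current - 1
    return current


rates = [1000, 600, 300]                      # transmission rates 1, 2 and 3
reports = [1000, 700, 650, 250, 400, 1300]    # client QoS reports (Kbps)
index = 0
trace = []
for report in reports:
    index = next_rate_index(rates, index, report)
    trace.append(index)
print(trace)  # [0, 1, 1, 2, 2, 1]
```

The sketch also makes the cost visible: every decision depends on a fresh client report, so the client and the server must exchange signaling throughout the session.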
Disadvantages of adaptive progressive download are that (i) it is complex, since it requires the client to constantly derive QoS measurements and feed these measurements back to the server, and (ii) it requires extra communication resources to support the signaling between the client and the server. In addition, when the transmission rate is reduced to adapt to deteriorating channel conditions, the streaming quality and, thus, the viewing experience are considerably reduced. High quality streaming therefore cannot be guaranteed. A further disadvantage is that the media content at the server must be encoded at various encoding rates (i.e., at various levels of quality), which renders the whole process more complicated and costly.
An article entitled ‘Apple proposes HTTP streaming feature as IETF standard’ (retrievable via http://arstechnica.com/web/news/2009/07/apple-proposes-http-streaming-feature-as-a-protocol-standard.ars) gives a brief overview of HLS and DASH as described in 3GPP Technical Specification TS 26.247 (v10.0.0), entitled ‘Transparent end-to-end Packet-switched Streaming Service (PSS); Progressive Download and Dynamic Adaptive Streaming over HTTP (3GP-DASH)’.
HLS and DASH support media streaming over a single communication path (or a single radio access technology connection) and adaptive streaming by switching the rate of the encoded media content to match the available transport capacity. With these technologies, the client retrieves the media content as a sequence of small “chunks” of constant duration (typically, close to 10 seconds). Each chunk can be selected from a plurality of chunks available for the same portion of the media content, e.g., a portion of media content may be represented by a high quality chunk, a medium quality chunk, etc. The high quality chunk contains more information and so is larger than the medium quality chunk. Every time the client retrieves a chunk, it measures the retrieval duration. When the duration exceeds a certain threshold (or when other implementation triggers apply), the client requests the next chunk with reduced quality, and thus with reduced size, so that it can be retrieved faster. This way, even when the capacity of the transport channel is reduced (e.g., due to channel variations and transmission effects), the streaming flow can be sustained, albeit with reduced quality.
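The client-driven selection of the next chunk can be sketched as follows. This is a minimal illustration and not the algorithm of any actual HLS or DASH client: the function name `choose_next_quality`, the threshold fractions, and the rule for switching back to a higher quality are assumptions; the text above only states that the client reduces quality when the retrieval duration exceeds a threshold.

```python
def choose_next_quality(current, num_levels, retrieval_seconds,
                        chunk_seconds=10.0, slow_fraction=0.8,
                        fast_fraction=0.3):
    """Return the quality level for the next chunk (hypothetical rule).

    Level 0 is the highest quality; num_levels - 1 is the lowest.
    """
    # Retrieval took almost as long as the chunk plays: request lower quality.
    if retrieval_seconds > slow_fraction * chunk_seconds:
        return min(current + 1, num_levels - 1)
    # Retrieval was much faster than playback: try a higher quality (assumed).
    if retrieval_seconds < fast_fraction * chunk_seconds:
        return max(current - 1, 0)
    return current


level = 0
for duration in [4.0, 9.0, 9.5, 9.2, 2.0]:   # measured retrieval times (s)
    level = choose_next_quality(level, 3, duration)
    print(duration, "->", level)
```

Unlike the server-based approach, the decision here uses only measurements the client already has, so no feedback signaling to the server is needed.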
An advantage of HLS/DASH technologies is that the streaming flow dynamically adapts to the available transport capacity and features considerable robustness and reliability in mobile communication systems. A disadvantage, however, is that the adaptation is performed at the expense of quality; that is, when the transport channel deteriorates and cannot support a high data rate, the streaming flow is sustained but with reduced quality. For example, when a mobile device uses HLS or DASH to stream video content over WiFi, the stream will switch to low quality when the WiFi network becomes congested (e.g., when it cannot support a TCP throughput of more than 700-1000 Kbps) and cannot support the high quality stream. This has a negative effect on the user experience. Another disadvantage is that HLS/DASH technologies require the media content to be encoded in a special and somewhat complicated way (so that chunks are created). Therefore, media content that is already available on the Internet cannot be streamed with HLS/DASH technologies.
A difference between the HLS/DASH technologies and the adaptive progressive download technology is that the former utilize client-based adaptation, whereas the latter utilizes server-based adaptation. With client-based adaptation, it is the client that decides when to modify the streaming quality (e.g., by requesting chunks of lower quality), whereas with server-based adaptation the streaming server decides when to change its transmission rate (and thus send content of lower quality) by taking into account the feedback provided by the client. Due to the complexity of server-based adaptation, where the client and the server need to constantly exchange signaling to support the rate adaptation on the server side, the current trend is towards client-based adaptation and HLS/DASH streaming. However, as noted above, the vast majority of the streaming content on the Internet today is not encoded for HLS/DASH streaming, so streaming with progressive download is expected to dominate for many more years.
In conclusion, HLS and 3GPP DASH do provide efficient and adaptive technologies for mobile media streaming over a single communications path (or a single radio technology), but they cannot maintain high quality streaming experience when the transport channel deteriorates significantly.
In an effort to improve video streaming performance, several publications propose the use of multiple simultaneous IP interfaces to retrieve video chunks in parallel. For example, in an article entitled ‘Improving Internet Video Streaming Performance by Parallel TCP-based Request-Response Streams’ by Robert Kuschnig, Ingo Kofler and Hermann Hellwagner (retrievable via http://alvand.basu.ac.ir/~nassiri/courses/AdvNetworks/papers/paper25.pdf), the authors propose a request-response-based client-driven streaming system, much like HLS/DASH, which however uses multiple interfaces to simultaneously retrieve multiple chunks. The reported performance results indicate that multipath chunk retrieval can provide an overall TCP throughput that is relatively stable over a wide range of round trip time (RTT) values and much higher than that of single path chunk retrieval. In this publication, however, the mobile device keeps multiple interfaces continuously active. As a result, such multipath streaming mechanisms can have a considerable negative impact on battery consumption.
Another known technology that can be used for enhanced streaming experience is Multipath TCP (MPTCP) (see, for example, the Multipath TCP Internet drafts which are retrievable via http://tools.ietf.org/wg/mptcp/).
With MPTCP, the client and the server establish multiple, parallel TCP connections (typically over different communication technologies) which are simultaneously used to retrieve the media content. The advantage is that MPTCP can provide increased overall throughput by exploiting the available capacity over multiple communication paths. So, when an HTTP streaming session is conducted on top of MPTCP, the overall transport capacity can be increased and thus an improved streaming experience is expected, since low-quality media chunks will rarely be required. A disadvantage, however, is that both the client and the server must be upgraded to support MPTCP, which in most cases can be impractical. To alleviate this issue, an MPTCP proxy can be used between the client and the server, as proposed in a presentation entitled ‘mptcp proxies’ by Mark Handley (retrievable via www.ietf.org/proceedings/80/slides/mptcp-4.ppt). In this case, only the client and the proxy need to support MPTCP and not the streaming/media servers. Yet, there is still a requirement to upgrade the network infrastructure. Also, the use of proxies (or other middle boxes) breaks the end-to-end transparency and can create several issues, e.g., with applications that carry transport information in the payload (FTP, SIP, RTSP, etc.). In addition, with MPTCP it is the server that performs the transmission scheduling (i.e., the allocation of media data to the available communication paths or TCP sub-flows) and therefore the client cannot decide which path or radio access technology to use. This is an additional disadvantage because the server may choose to send data over the most expensive communication path and the client has no power to avoid or otherwise control this behaviour. In addition, as is the case with all streaming technologies that split the media content into chunks, the encoding process is somewhat complicated and is not compatible with the encoding process that is currently in wide use on the Internet.