Transmission of video over data networks, such as the Internet, is commonplace today. To receive such signals, a user can use a suitably configured computer or other receiver such as a “set top box” (STB). STBs have become increasingly popular and many are provided with an IP connection, allowing content such as video to be streamed or downloaded over the Internet. Television delivered over the Internet, commonly referred to as IPTV, is a good example of this growing service.
When streaming video data over an IP network, there are no guarantees that the data sent will reach its destination. When the network experiences congestion or other problems, data packets will be delayed and some packets may even be lost.
To provide more reliable end-to-end delivery of data, the transmission control protocol (TCP) is often used as the transport protocol. Indeed, it is quite common to use TCP in video streaming systems for a number of reasons, but primarily because TCP provides mechanisms for ensuring reliable delivery, and managing network congestion. For example, one way in which TCP achieves reliability is by obliging the receiver to acknowledge to the sender any data received. If a packet of data remains unacknowledged after a predetermined period of time, TCP assumes the packet was not received and the same packet is retransmitted by the sender. One way that TCP manages congestion is by reducing the transmission rate of data as a function of congestion in the network.
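The acknowledge-and-retransmit behaviour described above can be illustrated with a deliberately simplified sketch. It is a stop-and-wait model, not real TCP (which uses sliding windows, cumulative acknowledgements and an adaptive retransmission timer). The function name `send_with_retransmit`, the `lost` set and the `max_attempts` limit are assumptions introduced purely for illustration.

```python
def send_with_retransmit(num_packets, lost, max_attempts=5):
    """Send packets one at a time, retransmitting any that go unacknowledged.

    `lost` is a set of (sequence number, attempt) pairs that the simulated
    network drops, so no acknowledgement arrives for those transmissions.
    Returns the log of every transmission as (sequence number, attempt).
    """
    log = []
    for seq in range(num_packets):
        for attempt in range(max_attempts):
            log.append((seq, attempt))
            if (seq, attempt) not in lost:
                break  # acknowledgement received, move to the next packet
            # otherwise the timer expires and the same packet is sent again
        else:
            raise RuntimeError(f"packet {seq} was never acknowledged")
    return log


# Hypothetical loss pattern: packet 1 is lost once, packet 3 is lost twice.
log = send_with_retransmit(4, lost={(1, 0), (3, 0), (3, 1)})
print(len(log), "transmissions were needed to deliver 4 packets")
```

Each lost transmission costs at least one timeout period before the retransmission, which is the source of the delays discussed in this section.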
Take the scenario where a number of video streams are being delivered using TCP and all share a contended part of the network. When congestion occurs, the TCP congestion control algorithm will force all the streams to back off their delivery rate to allow the congestion to clear. Each stream backs off by a fixed factor and eventually all streams will stabilise at approximately the same bandwidth (assuming a similar round trip time). Use of such a method is not without problems, as delays to segments of the video streams are particularly undesirable. This can be mitigated, at least in part, by techniques such as receiver buffering, or by dropping occasional segments and relying on error recovery instead.
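The convergence of competing streams to roughly equal bandwidth can be sketched with a toy additive-increase, multiplicative-decrease model. This is a minimal illustration under strong assumptions (all streams detect congestion in the same round, share the same round trip time, and halve their rate on congestion); the function name `simulate_aimd`, the capacity and the starting rates are all hypothetical.

```python
def simulate_aimd(rates, capacity, rounds, increase=1.0, backoff=0.5):
    """Toy model of streams sharing one bottleneck link.

    Each round, if the combined rate exceeds the link capacity every stream
    multiplies its rate by `backoff`; otherwise every stream adds `increase`.
    """
    rates = list(rates)
    for _ in range(rounds):
        if sum(rates) > capacity:
            rates = [r * backoff for r in rates]   # congestion: back off
        else:
            rates = [r + increase for r in rates]  # no congestion: probe upwards
    return rates


# Two streams start far apart (80 and 10 units) on a link of capacity 100.
final_rates = simulate_aimd([80.0, 10.0], capacity=100.0, rounds=500)
gap = abs(final_rates[0] - final_rates[1])
print(f"final rates: {final_rates}, gap: {gap:.6f}")
```

In this model the additive step leaves the gap between the two rates unchanged while every back-off halves it, so the rates drift together regardless of where they started.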
Video streams are also sometimes delivered at a variable bitrate over TCP. However, the above congestion scenario may still occur, and two streams each having a different bitrate will still stabilise to roughly the same reduced bitrate when the network is congested. This can lead to particularly undesirable results where a first stream is encoded at a high bitrate, for example a video sequence with high frame activity such as a sports sequence, and a second stream is encoded at a low bitrate, for example a video sequence with low frame activity such as a news or drama sequence.
When congestion is experienced in the network, TCP will cut the available bandwidth for both streams to roughly the same level. This will affect the first stream, which was encoded at a higher bitrate and thus has a higher bandwidth requirement, more than the second stream, which was encoded at a lower bitrate and thus may still have enough bandwidth to carry its content. Put another way, the first, high bitrate stream will be more significantly affected than the second, low bitrate stream, as the first stream is given the same reduced bandwidth as the second stream. This will cause the quality of the video delivered to each user to vary over time, and the quality to vary from user to user depending on the type of video clip they are viewing.
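A short worked example makes the asymmetry concrete. The bitrates below (4.0 Mbit/s for the sports stream, 1.0 Mbit/s for the news stream, and a congested share of 1.5 Mbit/s each) are hypothetical figures chosen only for illustration, as is the helper name `delivered_fraction`.

```python
def delivered_fraction(encoded_mbps, available_mbps):
    """Fraction of a stream's encoded bitrate that the available bandwidth can carry."""
    return min(1.0, available_mbps / encoded_mbps)


CONGESTED_SHARE_MBPS = 1.5  # assumed equal share per stream under congestion

sports = delivered_fraction(encoded_mbps=4.0, available_mbps=CONGESTED_SHARE_MBPS)
news = delivered_fraction(encoded_mbps=1.0, available_mbps=CONGESTED_SHARE_MBPS)
print(f"sports stream receives {sports:.1%} of its required bitrate")
print(f"news stream receives {news:.1%} of its required bitrate")
```

With these figures the news stream is unaffected, while the sports stream receives well under half of the bandwidth it needs, even though both were given the same share.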
Another way of streaming video that mitigates some of these problems experienced under TCP is to use a constant bitrate delivery system where the bitrate available to a video stream is fixed, for example by a reservation scheme, before the transmission of data starts. This method of delivery is easier to manage, but is not without its problems.
Again, take the example of the two video streams above: a first stream that has very active frames, such as a sports clip, and a second stream with less active frames, such as a news clip. The bitrate reserved and used to deliver each of the two streams is fixed at a predetermined rate (one that is considered to be sufficient for most applications and, in this case, for both streams). However, the second stream will not actually require that much bandwidth, as the bitrate of the encoding can be much lower than that of the first sequence given that the activity in the second sequence is much less. The second stream transmitted using this fixed bandwidth is thus wasting much of its bandwidth. If the encoding rate of the second stream is increased so as to utilise the entire bandwidth reserved, the resulting video is likely to be of much higher quality than the first stream. However, this increase in quality may not be perceived as significant by the viewer and may thus be wasted. Moreover, having this redundant bandwidth is not an efficient use of network resources.
The problems above are heightened for video sequences whose activity varies during the sequence itself. For example, a relatively static news reading sequence might be interspersed with highlights of very active football clips.
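The waste under a fixed reservation can be quantified with a small calculation. The reserved rate of 4.0 Mbit/s and the encoded bitrates of 4.0 and 1.0 Mbit/s are hypothetical values, and `reservation_usage` is an illustrative helper rather than part of any real reservation scheme.

```python
def reservation_usage(reserved_mbps, encoded_mbps):
    """Return (utilisation, unused bandwidth in Mbit/s) for a fixed reservation."""
    used = min(reserved_mbps, encoded_mbps)
    return used / reserved_mbps, reserved_mbps - used


RESERVED_MBPS = 4.0  # assumed fixed reservation applied to every stream

sports_usage = reservation_usage(RESERVED_MBPS, encoded_mbps=4.0)
news_usage = reservation_usage(RESERVED_MBPS, encoded_mbps=1.0)
print(f"sports: {sports_usage[0]:.0%} utilised, {sports_usage[1]} Mbit/s unused")
print(f"news:   {news_usage[0]:.0%} utilised, {news_usage[1]} Mbit/s unused")
```

Under these assumptions the news stream leaves three quarters of its reservation idle, bandwidth that cannot be used by any other stream while the reservation holds.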