Transmission of video over data networks, such as the Internet, is commonplace today. To receive such signals, a user can use a suitably configured computer or other receiver such as a “set top box” (STB). STBs have become increasingly popular and many are provided with an IP connection allowing content such as video to be streamed or downloaded over the Internet. Television delivered over the Internet, commonly referred to as IPTV, is a good example of this growing service.
When streaming video data over an IP network, there is no guarantee that the data sent will reach its destination. When the network experiences congestion or other problems, data packets are delayed and some may even be lost.
To provide more reliable end-to-end delivery of data, the transmission control protocol (TCP) is often used as the transport protocol. Indeed, it is quite common to use TCP in video streaming systems for a number of reasons, but primarily because TCP provides mechanisms for ensuring reliable delivery, and managing network congestion. For example, one way in which TCP achieves reliability is by obliging the receiver to acknowledge to the sender all data received. If a packet of data remains unacknowledged after a predetermined period of time, TCP assumes the packet was not received and the same packet is retransmitted by the sender. One way that TCP manages congestion is by reducing the transmission rate of data as a function of congestion in the network.
Take the scenario where a number of video streams are being delivered using TCP and all share a contended piece of network. When congestion occurs, the TCP congestion control algorithm will force all the streams to back off their delivery rate to allow the congestion to clear. Each stream backs off by a fixed factor and eventually all streams will stabilise at approximately the same bandwidth (assuming a similar round trip time).
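The convergence described above can be illustrated with a minimal sketch. It is not an implementation of TCP: it assumes two perfectly synchronised flows with the same round trip time, a hypothetical additive increase of one unit per step, and a hypothetical back-off factor of 0.5 applied whenever the combined rate exceeds the link capacity. The function and parameter names are illustrative only.

```python
def simulate_aimd(initial_rates, capacity, steps=200, increase=1.0, backoff=0.5):
    """Simplified additive-increase/multiplicative-decrease model (not real TCP)."""
    rates = list(initial_rates)
    for _ in range(steps):
        if sum(rates) > capacity:
            # Congestion: every flow backs off by the same fixed factor.
            rates = [r * backoff for r in rates]
        else:
            # No congestion: every flow probes for a little more bandwidth.
            rates = [r + increase for r in rates]
    return rates


# Two flows start with very different rates on a shared link of 100 units.
rates = simulate_aimd([80.0, 10.0], capacity=100.0)
gap = abs(rates[0] - rates[1])
print(rates, gap)
```

In this model each back-off halves the gap between the two flows while the additive increase leaves it unchanged, so the flows drift towards the same rate regardless of where they started.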
Use of such a method is not without problems. If the bandwidth becomes less than that required by the video content, play-out of the video could be stalled until sufficient data has been received to restart play-out. This situation can be mitigated by buffering data at the receiver having previously received it faster than necessary for play-out, and by switching the quality of the video transmitted, so that the required bandwidth is reduced to less than or equal to that now provided by the network.
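The effect of the two mitigations can be sketched with a simple receiver buffer model. The model is an assumption made for illustration: time advances in one-second steps, the video is constant bit rate, and play-out of a second of video proceeds only if a full second of data is buffered. All names and figures are hypothetical.

```python
def count_stalled_seconds(network_kbps, video_kbps, prebuffer_kbit=0.0):
    """Count seconds in which play-out stalls for lack of buffered data."""
    buffered = prebuffer_kbit
    stalled = 0
    for available in network_kbps:
        buffered += available            # data received during this second
        if buffered >= video_kbps:
            buffered -= video_kbps       # one second of video is played out
        else:
            stalled += 1                 # not enough data: play-out stalls
    return stalled


# Network delivers 1000 kbit/s for five seconds, then drops to 400 kbit/s.
network = [1000] * 5 + [400] * 5

stalls_plain = count_stalled_seconds(network, video_kbps=800)
stalls_prebuffered = count_stalled_seconds(network, video_kbps=800, prebuffer_kbit=1600)
stalls_lower_quality = count_stalled_seconds(network, video_kbps=400)
print(stalls_plain, stalls_prebuffered, stalls_lower_quality)  # prints: 2 0 0
```

With these figures, an 800 kbit/s video stalls twice once the network rate drops, whereas either a larger reserve of buffered data or a switch to a 400 kbit/s encoding avoids the stalls.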
Rate-adaptive, variable bitrate video streams, where the transmitted video quality or bitrate is adapted over time, are also sometimes delivered over TCP. However, the above congestion scenario may still occur, and two streams each having a different average encoded bitrate for the same video quality will still stabilise to roughly the same reduced transmission bitrate when the network is congested. This can produce particularly undesirable results where a first stream is initially encoded at a high bitrate, for example a video sequence with high frame activity such as a sports sequence, and a second stream is encoded at a low bitrate, for example a video sequence with low frame activity such as a news or drama sequence.
When congestion is experienced in the network, TCP will cut the available bandwidth for both streams to roughly the same level. This will affect the first stream, which was encoded at a higher bitrate and thus has a higher bandwidth requirement, more than the second stream, which was encoded at a lower bitrate and thus may still have enough bandwidth. Put another way, the first, high bitrate, stream will be more significantly affected than the second, low bitrate stream, as the first stream is given the same reduced bandwidth as the second stream. This will cause the quality of the video delivered to each user to vary over time, and the quality to vary from user to user depending on the type of video clip they are viewing.
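A small worked example makes the imbalance concrete. The bitrates below are hypothetical, and the equal split of capacity is the simplifying assumption taken from the scenario above.

```python
def fraction_of_need_met(required_kbps, capacity_kbps):
    """Share capacity equally and report how much of each stream's need is met."""
    share = capacity_kbps / len(required_kbps)
    return {name: min(1.0, share / need) for name, need in required_kbps.items()}


# Hypothetical bitrates needed for the same perceived quality.
needs = {"sports": 4000, "news": 1500}
met = fraction_of_need_met(needs, capacity_kbps=4000)
print(met)  # {'sports': 0.5, 'news': 1.0}
```

Both streams receive 2000 kbit/s, which fully satisfies the news stream but covers only half of what the sports stream requires.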
Another way of streaming video that mitigates some of these problems experienced under TCP is to use a constant bitrate delivery system where the bitrate available to a video stream is fixed, for example by a reservation scheme, before the transmission of data starts. This method of delivery is easier to manage, but is not without its problems.
Again, take the example of the two video streams above, where the first stream has very active frames, such as a sports clip, and the second stream has less active frames, such as a news clip. The bitrates reserved and used to deliver the two streams are fixed at a predetermined rate (one that is considered to be sufficient for most applications, and in this case for both streams). However, the second stream will not actually require that much bandwidth, as its encoding bitrate can be much lower than that of the first sequence given that the activity in the second sequence is much lower. The second stream transmitted using this fixed bandwidth is thus wasting much of its bandwidth. If the second stream increases its encoding rate so as to utilise the entire bandwidth reserved, the resulting video is likely to be of much higher quality than the first stream. However, this increase in quality may not be perceived as significant by the viewer and may thus be wasted. Moreover, having this redundant bandwidth is not an efficient use of network resources.
The problems above are heightened when one considers video sequences that vary in activity during the sequence itself. For example, a relatively static news-reading sequence might be interspersed with highlights of very active football clips.
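The mismatch between a fixed reservation and a sequence of varying activity can be quantified with a short sketch. The segment figures and the reserved rate are hypothetical, and each segment is assumed to last one second so that rates and amounts of data are interchangeable.

```python
def reservation_mismatch(segment_needs_kbps, reserved_kbps):
    """Return (wasted, shortfall) in kbit for one-second segments."""
    wasted = sum(max(0, reserved_kbps - need) for need in segment_needs_kbps)
    shortfall = sum(max(0, need - reserved_kbps) for need in segment_needs_kbps)
    return wasted, shortfall


# News reading, news reading, football highlights, news reading.
needs = [1000, 1000, 4500, 1200]
wasted, shortfall = reservation_mismatch(needs, reserved_kbps=3000)
print(wasted, shortfall)  # prints: 5800 1500
```

Under these assumptions the same reservation is wasteful during the static segments and insufficient during the active one.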
International patent WO2008/119954 describes a method of delivering video streams over a contended network, where each stream is delivered at a constant quality.
International patent WO2004/047455 describes a method of delivering a variable bit rate sequence over a network at a piecewise constant bit rate, with the rate of each piece decreasing monotonically. The resulting bit rate profile is referred to as a “downstairs” function.
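One way to construct a piecewise constant, monotonically decreasing rate profile is sketched below. It is offered as an illustration of the "downstairs" shape only and is not taken from the cited document. It assumes that one frame is played out per time slot, that there is no start-up delay, and that each frame must be fully delivered by the end of its own slot. Each step uses the largest average bit rate over any run of frames beginning at the current position, and holds that rate to the end of the run.

```python
def downstairs_rates(frame_bits):
    """Per-slot delivery rates forming a non-increasing step function."""
    rates = []
    start = 0
    n = len(frame_bits)
    while start < n:
        best_rate, best_end, total = 0.0, start, 0
        for end in range(start, n):
            total += frame_bits[end]
            average = total / (end - start + 1)
            if average >= best_rate:     # on ties, prefer the longer run
                best_rate, best_end = average, end
        # Hold the peak average rate until the end of the run that set it.
        rates.extend([best_rate] * (best_end - start + 1))
        start = best_end + 1
    return rates


frame_bits = [10, 2, 6, 1, 1, 3]         # hypothetical frame sizes
rates = downstairs_rates(frame_bits)
print(rates)
```

For these hypothetical frame sizes the profile has three steps: a rate of 10 for the first slot, 4 for the next two, and 5/3 for the last three. The total delivered matches the total encoded, and no frame arrives later than its play-out slot.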
U.S. Pat. No. B1-6,259,733 describes a method for statistical multiplexing, where multiple video sources are encoded at the same time and multiplexed into a single channel for transmission. The video sources are analysed for spatial and temporal complexity to obtain a relative bit rate need, which is scaled according to an importance factor (for example, high for movies and low for news) and then used to divide up the bandwidth.
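The general idea of dividing a channel according to weighted need can be sketched as follows. The proportional split, the single complexity score per source, and all of the figures are assumptions made for illustration, not details of the cited patent.

```python
def divide_bandwidth(sources, channel_kbps):
    """Split a channel in proportion to complexity scaled by importance."""
    weighted = {
        name: complexity * importance
        for name, (complexity, importance) in sources.items()
    }
    total = sum(weighted.values())
    return {name: channel_kbps * w / total for name, w in weighted.items()}


# Hypothetical (complexity, importance) pairs for three sources.
sources = {
    "movie": (3.0, 2.0),
    "news": (1.0, 1.0),
    "sports": (3.0, 1.0),
}
allocation = divide_bandwidth(sources, channel_kbps=10000)
print(allocation)  # {'movie': 6000.0, 'news': 1000.0, 'sports': 3000.0}
```

Here the movie and the sports source have the same complexity, but the higher importance factor gives the movie twice the share.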
US patent application 2006/224762 describes a method of estimating an encoding complexity for video sequences, and using that estimated encoding complexity to determine a bit rate for encoding.