As will be familiar to a person skilled in the art, various techniques exist for encoding video and/or audio streams for transmission over a network.
For instance, video coding commonly uses two types of video frames: intra-frames and inter-frames. An intra-frame is compressed using only the current video frame, i.e. intra-frame prediction, similarly to static image coding. An inter-frame, on the other hand, is compressed using knowledge of a previously decoded frame, and allows for much more efficient compression when there are relatively few changes over time in the scene being encoded. Inter-frame coding is particularly efficient for, e.g., a talking head with a static background, typical in video conferencing. Depending on the resolution, frame-rate, bit-rate and scene, an intra-frame can require 20 to 100 times more data than an inter-frame. On the other hand, an inter-frame imposes a dependency on the previously decoded frames, back to the most recent intra-frame. If any of those frames are missing, decoding the current inter-frame may result in errors and artifacts.
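The dependency chain described above can be illustrated with a minimal sketch (the function and frame representation are illustrative assumptions, not taken from any standard): an inter-frame is decodable only if every frame back to the most recent intra-frame arrived and was decoded, while an intra-frame resets the decoding state.

```python
# Illustrative model of inter-frame dependency (hypothetical, simplified):
# 'I' marks an intra-frame, 'P' an inter-frame predicted from its predecessor.

def decodable_frames(frames, lost):
    """frames: list of 'I' (intra) or 'P' (inter) frame types.
    lost: set of frame indices dropped in transit.
    Returns the set of indices the receiver can decode correctly."""
    decodable = set()
    chain_ok = False  # whether the reference chain is currently intact
    for i, ftype in enumerate(frames):
        if i in lost:
            chain_ok = False      # a loss breaks the chain
            continue
        if ftype == 'I':
            chain_ok = True       # an intra-frame resets the decoder state
        if chain_ok:
            decodable.add(i)      # an inter-frame needs an intact chain
    return decodable

# Losing inter-frame 2 also corrupts frame 3, which depends on it;
# the next intra-frame (index 4) recovers correct decoding.
frames = ['I', 'P', 'P', 'P', 'I', 'P', 'P', 'P']
print(sorted(decodable_frames(frames, lost={2})))  # [0, 1, 4, 5, 6, 7]
```

This captures why a single lost packet can corrupt every subsequent inter-frame until the next intra-frame arrives.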
These ideas are used for example in the H.264/AVC standard (see T. Wiegand, G. J. Sullivan, G. Bjontegaard, A. Luthra: “Overview of the H.264/AVC video coding standard,” in IEEE Transactions on Circuits and Systems for Video Technology, Volume: 13, Issue: 7, page(s): 560-576, July 2003).
Frequent and periodic transmission of intra-frames is common in video streaming. These periodically transmitted intra-frames are sometimes referred to as “key-frames”. The idea is illustrated schematically in FIG. 1, where key-frames 1, 5, 9 and 13 etc. (shown black) are interleaved periodically between the transmission of inter-frames 2-4, 6-8, 10-12 and 14-16 etc. (shown white). The key-frames are needed for two main reasons. Firstly, when a new user joins the session, he/she can only start decoding the video once a key-frame is received. Secondly, on packet loss, particularly bursty packet loss, the key-frame is a way to recover the lost coding state for proper decoding. The key-frames allow the receiver to periodically update with “absolute” data, rather than data encoded relative to previous frames, thus avoiding errors that could otherwise propagate due to packet loss. However, the larger size of key-frames also incurs a higher bandwidth cost in transmission.
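The periodic key-frame pattern of FIG. 1 can be sketched as follows (a hypothetical helper, assuming 1-based frame numbering and a fixed key-frame interval of 4, matching key-frames at positions 1, 5, 9, 13, ...):

```python
# Hypothetical sketch of the periodic key-frame schedule of FIG. 1:
# with a key-frame interval of 4, frames 1, 5, 9, 13, ... are encoded
# as intra-frames ('I') and all other frames as inter-frames ('P').

def frame_type(frame_number, key_interval=4):
    """Return 'I' for a key-frame slot, 'P' otherwise (1-based numbering)."""
    return 'I' if (frame_number - 1) % key_interval == 0 else 'P'

print(''.join(frame_type(n) for n in range(1, 17)))
# frames 1-16 -> IPPPIPPPIPPPIPPP
```

In practice the interval is a tunable encoder parameter, which is exactly the balancing problem discussed next.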
Therefore conventionally it is necessary to try to balance the bandwidth cost of transmitting intra-frames too frequently against the effect of packet loss, which may be exacerbated by transmitting too few intra-frames. This may be a particular problem where the stream is intended for two or more recipients. Referring to FIGS. 5 and 7a, suppose for the sake of example that a transmitting node 102(X) is to transmit a video stream to two recipient nodes 102(Y) and 102(Z) over a packet based network 108 such as the Internet. These three nodes may be referred to as nodes X, Y and Z for brevity. Each of the recipients Y and Z will set up a respective one-to-one connection with the transmitting node X over the network 108, and X will encode the video stream for transmission over each of those connections. However, it may be that the connection (or channel) from X to Z can support a higher bandwidth than the connection from X to Y, or experiences a worse rate of packet loss. In that case a stream comprising a higher rate of intra-frames (and therefore a higher coding rate incurring a higher bandwidth) would be more appropriate for transmission to Z, but a stream comprising a lower rate of intra-frames would be more appropriate for transmission to Y. Because intra-frames are less efficiently encoded and so require more data, it would be desirable to avoid their unnecessary transmission. The transmitting node X could generate two different versions of the stream, but that may incur an unnecessary processing cost at the transmitter X.
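A back-of-envelope calculation makes the bandwidth trade-off concrete. Assuming, per the figure given earlier, that an intra-frame costs R times the data of an inter-frame (the function below is purely illustrative, not from the source), the average cost per frame for a key-frame interval of N is (R + (N - 1)) / N inter-frame-equivalents:

```python
# Illustrative trade-off calculation (assumed model, not from the source):
# average data per frame, in units of one inter-frame, when one intra-frame
# (costing intra_ratio inter-frames) is sent every key_interval frames.

def relative_bandwidth(key_interval, intra_ratio):
    return (intra_ratio + (key_interval - 1)) / key_interval

# Using the 20x figure from above: a key-frame every 4 frames versus
# every 16 frames.
print(relative_bandwidth(4, 20))   # 5.75 (times an all-inter stream)
print(relative_bandwidth(16, 20))  # 2.1875
```

So shortening the key-frame interval from 16 to 4 more than doubles the required bandwidth in this model, which is why a per-recipient choice of interval is attractive.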
A similar problem may occur more generally for example if the transmitting node X has a choice of different available codecs or coding options for encoding an audio or video stream. One codec may result in a higher bitrate encoded stream (i.e. higher coding rate) which would be more suitable for transmission over the connection from X to Z, whereas another codec may result in a lower bitrate encoded stream and therefore be more suitable for transmission over the connection from X to Y. In such a situation, it may conventionally be necessary to transmit using a codec that is not optimal for one of the two connections, or perhaps a third codec which is a compromise between the two. The transmitting node could alternatively encode two versions of the stream with different coding rates, but that would be wasteful of processing resources at the transmitting node X.
It would be desirable to try to mitigate these problems to some extent.