Various point-to-multipoint (PTM) distribution systems for multimedia data exist today. For instance, TV channels may be delivered via the Internet (IPTV) or in a mobile environment (Mobile TV) using digital multicast or broadcast techniques. Current standards for Mobile TV include, for example, the Digital Video Broadcasting - Handheld (DVB-H) framework and the Multimedia Broadcast/Multicast Service (MBMS) feature of the 3rd Generation Partnership Project (3GPP).
In such systems, media data such as mobile TV data are represented as a stream of pictures, each picture comprising frame information (a frame being an image captured at some instant in time) or at least some field information (any information that may contribute to an image at some time instant). A TV channel normally carries a compressed video stream that is encoded using so-called predictive coding techniques. An intra-coded picture (I-picture) allows instantaneous decoding, because the picture is coded without reference to any other picture. Such I-pictures are typically generated by an encoder to create a point in time at which a decoder can start proper decoding. In contrast to I-pictures, predictively-coded pictures (P-pictures) are encoded such that information from previous pictures is required in order to fully decode the P-picture to an image. A P-picture may contain image data, motion vector displacements and/or combinations of such data. P-pictures may be encoded using a previously encoded I-picture and/or P-picture(s) as references. Bidirectional predictively-coded frames or pictures (B-pictures) may additionally or alternatively include information related to future pictures. The distance between two consecutive I-pictures in the stream is denoted as the GOP (Group Of Pictures) size. As I-pictures are more costly in terms of bit rate than P-pictures, a large GOP size, i.e. a high number of P- (or B-)pictures in the stream, is advantageous for achieving a high compression ratio. However, as a decoder in a TV receiver has to wait for an I-picture to start decoding, a low frequency of I-pictures in the stream means that, after a channel switching request by a user, a considerable delay may occur until the receiver has tuned to the requested new channel and starts the play-out.
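The trade-off between GOP size and switching delay described above can be illustrated with a short calculation. The following is a minimal sketch, not part of any standard: it assumes a constant picture rate, a fixed GOP size, and a tune-in time that is uniformly distributed over the GOP period, and it ignores buffering and network latency. The function name `tune_in_delay` and its parameters are hypothetical.

```python
def tune_in_delay(gop_size, picture_rate):
    """Return (mean_wait, worst_case_wait) in seconds until the next I-picture.

    Assumptions (illustrative only): constant picture rate, exactly one
    I-picture per GOP, tune-in time uniformly distributed over the GOP period.
    """
    if gop_size <= 0 or picture_rate <= 0:
        raise ValueError("gop_size and picture_rate must be positive")
    gop_duration = gop_size / picture_rate  # seconds between consecutive I-pictures
    # A uniformly random arrival waits half the period on average,
    # and at most one full period.
    return gop_duration / 2.0, gop_duration


# Hypothetical figures: a GOP of 50 pictures at 25 pictures per second
mean_wait, worst_wait = tune_in_delay(50, 25.0)
print(mean_wait, worst_wait)  # 1.0 2.0
```

Under these assumptions, doubling the GOP size doubles both the mean and the worst-case wait, which is why a large GOP size favours compression but penalizes channel switching.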
Techniques for reducing this channel switching delay are generally referred to as ‘fast channel switching’ techniques. For fast channel switching, in addition to the “primary channels” for transmitting the actual TV streams, a further “secondary channel” may be provided for the transmission of supplemental I-pictures for each of the primary channels with which the secondary channel is associated. For example, in a mobile TV system as provided over a mobile network, a single secondary channel may be provided for four to five primary channels.
In order to limit the additional bandwidth requirements associated with the secondary channel, the supplemental I-pictures are typically of lower quality than the I-pictures in the primary channels. Thus, the video quality after tune-in is momentarily decreased until reception of the first I-picture of the tuned primary channel. The stream of supplemental I-pictures in the secondary channel may also have a reduced picture rate compared to the corresponding primary channel(s), i.e. not every P-picture in the primary channel has a corresponding supplemental I-picture in the secondary channel. However, even with a reduced picture rate, the channel switching delay can be considerably reduced. For example, a rate of one supplemental I-picture per GOP of the primary channel may reduce the delay on average by half.
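The halving of the average delay mentioned above can be checked with a small model. This is a sketch under stated assumptions rather than a description of any particular system: tune-in times are taken to be uniformly distributed, and the single supplemental I-picture per GOP is assumed to be placed midway between two primary I-pictures (the text does not specify the placement). The function name `mean_wait` and the example figures are hypothetical.

```python
def mean_wait(entry_offsets, gop_duration):
    """Mean wait in seconds until the next decodable entry point.

    entry_offsets: times within one GOP period at which an I-picture
    (primary or supplemental) becomes available to the decoder.
    Assumes the pattern repeats every gop_duration seconds and that the
    tune-in time is uniformly distributed over that period.
    """
    points = sorted(set(o % gop_duration for o in entry_offsets))
    if not points:
        raise ValueError("at least one entry point is required")
    # Gaps between consecutive entry points, including the wrap-around
    # gap from the last entry point to the first one of the next GOP.
    gaps = [b - a for a, b in zip(points, points[1:])]
    gaps.append(gop_duration - points[-1] + points[0])
    # An arrival lands in a gap g with probability g / T and then waits
    # g / 2 on average, so the mean wait is sum(g^2) / (2 * T).
    return sum(g * g for g in gaps) / (2.0 * gop_duration)


gop = 2.0  # hypothetical GOP duration in seconds
primary_only = mean_wait([0.0], gop)                  # I-picture at GOP start only
with_supplemental = mean_wait([0.0, gop / 2.0], gop)  # plus one mid-GOP supplemental
print(primary_only, with_supplemental)  # 1.0 0.5
```

If the supplemental I-picture is not centred between the primary I-pictures, the same function shows a smaller gain, since the mean wait depends on the squared gap lengths.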