1. Technical Field
The present invention relates to a network unit adapted to transmit a video channel to be played at a nominal motion speed.
2. Description of Related Art
A network unit is, for instance, an access unit for connecting subscribers to a data communication network, such as a Digital Subscriber Line Access Multiplexer (DSLAM), an Ethernet bridge, an edge router, etc., or any intermediate network unit to which subscribers are coupled, or a video server for supplying video channels to subscribers.
An example of a dedicated multimedia stream is a unicast stream bound to a particular unicast address, being a unicast network address such as a unicast IP address, or a unicast hardware address such as a unicast MAC address.
An example of a common multimedia stream is a multicast stream bound to a particular multicast address, being a multicast network address such as a multicast IP address, or a multicast hardware address such as a multicast MAC address. Another example of a common multimedia stream is a broadcast stream bound to a broadcast address, being a network broadcast address such as the broadcast IP address 255.255.255.255, or a broadcast hardware address such as the broadcast MAC address FF:FF:FF:FF:FF:FF.
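The distinction between dedicated (unicast) and common (multicast or broadcast) addresses can be illustrated with a minimal Python sketch. The function names classify_ipv4 and classify_mac are hypothetical and chosen for illustration only; the sketch assumes IPv4 addresses and 48-bit MAC addresses written as six colon-separated octets, and it only recognizes the limited broadcast address 255.255.255.255, not subnet-directed broadcast addresses.

```python
import ipaddress

# The limited broadcast address named in the text.
LIMITED_BROADCAST = ipaddress.IPv4Address("255.255.255.255")


def classify_ipv4(addr):
    """Return 'broadcast', 'multicast' or 'unicast' for an IPv4 address string."""
    ip = ipaddress.IPv4Address(addr)
    if ip == LIMITED_BROADCAST:
        return "broadcast"
    if ip.is_multicast:  # 224.0.0.0/4
        return "multicast"
    return "unicast"


def classify_mac(addr):
    """Return 'broadcast', 'multicast' or 'unicast' for a 48-bit MAC address."""
    octets = [int(part, 16) for part in addr.split(":")]
    if len(octets) != 6:
        raise ValueError("a 48-bit MAC address has exactly six octets")
    if all(octet == 0xFF for octet in octets):
        return "broadcast"
    # The least significant bit of the first octet marks a group address.
    if octets[0] & 0x01:
        return "multicast"
    return "unicast"
```

A stream bound to an address classified as unicast would be a dedicated stream in the sense used above, whereas multicast and broadcast addresses correspond to common streams.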
A video channel is a sequence of video pictures or video frames that are displayed on screen at a nominal frame rate, typically at 25 or 30 frames per second (fps). The nominal frame rate value is chosen according to the human visual remanence, and is such that the viewer perceives successive video frames as a continuous motion sequence.
Picture information is digitally captured, encoded, and transmitted over the air (e.g., by broadcast, wireless or mobile communication), by satellite, or via a wired communication network (e.g., via a local loop or an optical fiber), or read from a carrier medium (e.g., a Digital Video Disc (DVD), a Compact Disc (CD) or a video tape), typically as part of a multimedia stream that comprises further multimedia information, such as one or more audio channels, one or more language subtitles, etc., and is ultimately decoded to refresh the displayed picture, nominally at the same rate as the nominal frame rate, which typically matches the nominal capture rate, thereby resulting in a nominal motion speed, that is to say a motion speed that fits the real-time perception of spatial motion.
Inter-frame encoding is used to reduce the encoded bit rate while maintaining an acceptable picture quality by coding a video frame with reference to previous and/or subsequent video frames.
Examples of video encoding schemes using inter-frame encoding are MPEG1 (ISO-IEC), MPEG2 (ISO-IEC), MPEG4 (ISO-IEC), H.261 (ITU-T), H.263 (ITU-T), H.264 (ITU-T, identical to MPEG4 part 10), VC-1 (SMPTE), and proprietary codecs such as RealVideo and the On2 codecs used in Macromedia Flash.
In a motion sequence, individual frames are grouped into a Group of Pictures (GoP). A GoP comprises one independent frame (or intra-frame, or anchor frame, or key frame, or I frame) as reference frame, and further dependent frames (or inter-frames, or delta frames) that ultimately relate to this independent frame.
Independent frames are encoded without referencing any other video frame, e.g. by reducing the spatial redundancy in the picture. Dependent frames are encoded by means of forward and/or backward prediction techniques such as motion compensation, and refer to other video frames, thereby achieving a higher compression ratio. Examples of dependent frames are predictive frames (or P frames), which are encoded with reference to the previous I or P frame, and bi-directional interpolative frames (or B frames), which are encoded with reference to both the previous and the next I or P frame.
As an example, a typical GoP is a sequence of video frames of the form I B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8, wherein P1 is encoded with reference to I, B1 and B2 with reference to both I and P1, P2 with reference to P1, B3 and B4 with reference to both P1 and P2, and so on, and eventually B7 and B8 with reference to P3 and the next incoming I frame.
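The reference relationships of this example can be derived mechanically, as in the following minimal Python sketch. The function name references and the placeholder label "next I" are hypothetical; the sketch assumes a well-formed GoP given in display order that starts with an I frame, with frame labels whose first letter is I, P or B.

```python
def references(gop):
    """Map each frame of a display-order GoP to the frames it is encoded against.

    P frames reference the previous I or P frame; B frames reference the
    previous and the next I or P frame. Trailing B frames reference the
    I frame of the following GoP, shown here as the placeholder 'next I'.
    """
    anchors = [i for i, frame in enumerate(gop) if frame[0] in "IP"]
    refs = {}
    for i, frame in enumerate(gop):
        kind = frame[0]
        if kind == "I":
            refs[frame] = []
            continue
        previous = max(a for a in anchors if a < i)
        if kind == "P":
            refs[frame] = [gop[previous]]
        else:  # B frame
            following = [a for a in anchors if a > i]
            next_label = gop[following[0]] if following else "next I"
            refs[frame] = [gop[previous], next_label]
    return refs


GOP = "I B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8".split()
for frame, refs in references(GOP).items():
    print(frame, "->", refs)
```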
The GoP size does not necessarily need to be fixed, and the encoder may decide on a per video frame basis which frame type to use.
It is to be noticed that video frames are re-ordered before being transmitted in such a way that they can be decoded at once as they are received without waiting for further incoming frames.
Referring to the previous example, the GoP will be transmitted as I[k] B7[k−1] B8[k−1] P1[k] B1[k] B2[k] P2[k] B3[k] B4[k] P3[k] B5[k] B6[k] I[k+1] B7[k] B8[k], and so on, where the bracketed index k denotes an arbitrary GoP index.
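The re-ordering from display order to transmission order can be sketched in Python as follows. This is an illustrative sketch only: the names transmission_order and display_stream, and the label notation "B7@0" (frame B7 of GoP 0), are assumptions made for this example. The rule applied is that each B frame is withheld until the I or P frame that follows it in display order has been sent.

```python
GOP = "I B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8".split()


def display_stream(gop_count):
    """Build a display-order stream of labelled frames, e.g. 'B7@0'."""
    return [f"{name}@{k}" for k in range(gop_count) for name in GOP]


def transmission_order(display):
    """Re-order frames so that every B frame follows both of its references.

    Returns the frames sent so far and the B frames still withheld because
    their forward reference has not yet appeared in the input.
    """
    sent, withheld = [], []
    for frame in display:
        if frame.startswith("B"):
            withheld.append(frame)
        else:
            sent.append(frame)     # anchor (I or P) goes out first
            sent.extend(withheld)  # then the B frames that were waiting on it
            withheld = []
    return sent, withheld


sent, withheld = transmission_order(display_stream(2))
print(" ".join(sent))
print("withheld:", " ".join(withheld))
```

In this sketch the last two B frames of the final GoP remain withheld, since they cannot be sent before the I frame of the following GoP.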
It is also to be noticed that a frame may relate to a frame of another GoP (open GoP versus closed GoP).
Referring to the previous example, the video sequence I B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 forms an open GoP.
When a user initiates a channel change from one video channel to another, the video replication point (the video server or an intermediate network unit to which the user is connected or coupled) stops sending frames of the prior video channel, and starts sending frames of the new channel.
Meanwhile, the decoding device flushes its decoding (or de-jitter) buffer and waits for video frames of the newly requested channel. The first received frame is likely to not be an independent frame, making decoding (and thus displaying) of the new channel impossible until a new independent frame is received. Thereupon, the decoding device shall still wait for a sufficient number of frames to be received before continuously playing the video channel at the nominal motion speed.
The decoding device may also experience some time variations in receiving video packets (also referred to as packet jitter), and/or video packets may be lost and/or corrupted. Consequently, the decoding device may buffer even more video frames so that a continuous playout of the video channel can be ensured in such an adverse environment.
In conclusion, the time between channel change and channel display can be significantly longer than the duration of one GoP, giving the user a slow-responding zapping experience.
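The order of magnitude of this delay can be estimated with a minimal Python sketch. The function channel_change_delay and its parameters are hypothetical; the sketch assumes that frames arrive at exactly the nominal frame rate, that the GoP size is fixed, and that playout starts once an independent frame plus a given number of further frames have been buffered. Network latency, jitter and packet loss are ignored.

```python
def channel_change_delay(gop_size, fps, join_offset, prebuffer_frames):
    """Estimate the seconds between joining a channel and starting playout.

    gop_size         -- number of frames per GoP (assumed fixed)
    fps              -- nominal frame rate in frames per second
    join_offset      -- position within the current GoP, in transmission
                        order, of the first frame received (0 = I frame)
    prebuffer_frames -- frames buffered after the I frame before playout
    """
    frames_until_i = (gop_size - join_offset) % gop_size
    return (frames_until_i + prebuffer_frames) / fps


# Joining just after an I frame of a 12-frame GoP at 25 fps, with one
# further GoP buffered before playout: (11 + 12) / 25 seconds.
print(channel_change_delay(gop_size=12, fps=25, join_offset=1, prebuffer_frames=12))
```

Under these assumptions the worst case approaches one full GoP of waiting plus the buffering time, which is consistent with a delay well above one GoP duration.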
Different solutions that optimize the effective channel switching time and/or that improve the user experience are known from the art.
A first optimization consists in caching the last-forwarded independent frame at a replication point, and in transmitting the so-cached independent frame upon channel change. The decoding device quickly receives and decodes this I frame. While a new GoP is awaited, the single decoded I frame is displayed as a still image. This results in an improved user experience because the user already sees the new channel, albeit as a still image.
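The caching behavior of this first optimization can be sketched in Python as follows. The class IFrameCache and its method names are hypothetical and do not correspond to any particular product; frames are represented as opaque payloads tagged with a frame type.

```python
class IFrameCache:
    """Keep the last-forwarded independent frame of each channel."""

    def __init__(self):
        self._last_i_frame = {}

    def on_forward(self, channel, frame_type, payload):
        """Called for every frame the replication point forwards."""
        if frame_type == "I":
            self._last_i_frame[channel] = payload

    def on_channel_change(self, channel):
        """Return the cached I frame to send at once, or None if none is cached."""
        return self._last_i_frame.get(channel)


cache = IFrameCache()
cache.on_forward("news", "I", b"i-frame-1")
cache.on_forward("news", "P", b"p-frame-1")
cache.on_forward("news", "I", b"i-frame-2")
print(cache.on_channel_change("news"))    # the most recent I frame
print(cache.on_channel_change("sports"))  # nothing cached yet
```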
A further optimization consists in caching a few GoPs of the video channel at a video replication point, and in dedicatedly supplying a video sequence, which starts with the last-forwarded independent frame, at a higher transmit rate, typically at 1.3 times the nominal frame rate. After a while, the decoding device is expected to have caught up with the steady state users that are supplied with a common multimedia stream, and from that time onwards, to switch to common transmission mode.
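The time needed to catch up can be estimated with a short Python sketch. The function catch_up_time is hypothetical; it assumes that the dedicated sequence starts a known number of seconds of video behind the common stream and is sent at a constant multiple of the nominal rate, so that the gap closes at (speedup − 1) seconds of video per second.

```python
def catch_up_time(lag_seconds, speedup=1.3):
    """Seconds of dedicated transmission needed to reach the common stream.

    lag_seconds -- how far behind the common stream the dedicated sequence
                   starts, expressed in seconds of video
    speedup     -- transmit rate as a multiple of the nominal rate
    """
    if speedup <= 1.0:
        raise ValueError("the dedicated stream must be faster than nominal")
    return lag_seconds / (speedup - 1.0)


# A sequence starting 0.3 s behind, sent at 1.3 times the nominal rate,
# closes the gap after roughly one second.
print(catch_up_time(0.3))
```

During that interval the dedicated stream consumes about 1.3 times the nominal bandwidth of the channel, which is the burst that the network must be dimensioned to absorb.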
This solution is disadvantageous in that it brings about traffic burstiness upon channel change, and in that the network must be over-dimensioned accordingly. Another drawback of this solution is that steady state users are not viewing the video channel synchronously, since the number of video frames that are buffered depends on when the channel change occurs with respect to the GoP that is currently being transmitted: the further back the last-transmitted independent frame, the more frames in the decoding buffer, and the longer the viewing delay.
A further optimization consists in dropping less important frames within the dedicatedly supplied video sequence, such as B frames, thereby reducing the induced network overload.
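Dropping B frames from the dedicated sequence can be sketched in Python as follows. The function drop_b_frames is hypothetical and assumes frame labels whose first letter gives the frame type. B frames are the natural candidates because, in the GoP structure described above, no other frame is encoded with reference to them.

```python
def drop_b_frames(frames):
    """Keep only I and P frames; no remaining frame depends on a dropped one."""
    return [frame for frame in frames if not frame.startswith("B")]


sequence = "I P1 B1 B2 P2 B3 B4 P3 B5 B6".split()
kept = drop_b_frames(sequence)
print(kept)
print(f"{len(sequence) - len(kept)} of {len(sequence)} frames dropped")
```

The saving in bits is smaller than the saving in frame count suggests, since B frames are typically the most heavily compressed frames of a GoP.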
Still a further optimization is to dedicatedly supply a lower-quality copy of the requested channel.