Hybrid video coding techniques have been widely adopted in video coding standards such as H.263, MPEG-2 and H.264/MPEG-4 AVC. Intensive work has been carried out on improving the visual quality within a given bit rate constraint using the existing coding tools. Generally, CBR (constant bit rate) control and VBR (variable bit rate) control are used to balance quality against the rate constraint for different applications. In CBR mode, the number of bits that can be transmitted to a video decoder in a given time interval is typically fixed. The decoder side also uses a buffer of specified size, referred to as the Video Buffering Verifier (VBV) in MPEG-2 and MPEG-4 Part 2, or as the Hypothetical Reference Decoder (HRD) in H.263 and H.264/MPEG-4 AVC. Related applications are, for example, TV broadcast, cable transmission and wireless communication of compressed video. In VBR mode, the total number of bits used to compress a long video sequence is typically fixed, while limits on the instantaneous bit rate are practically non-existent. Related applications are stored-media applications such as DVD (Digital Versatile Disc) and PVR (Personal Video Recorder).
Due to the high variability in the picture content present in many video sources, a long video sequence can be divided into consecutive video shots. A video shot may be defined as a sequence of frames captured by “a single camera in a single continuous action in time and space”. Usually it is a group of frames that have consistent visual characteristics (including colour, texture, motion, etc.). Therefore a large number of different types of scene changes or scene cuts can exist between such shots. A scene cut is an abrupt transition between two adjacent frames. Electronic scene cut detection as such is known. A common method is to use histograms for comparing consecutive video frames, cf. H. J. Zhang, A. Kankanhalli and S. W. Smoliar, “Automatic partitioning of full-motion video”, Multimedia Systems, volume 1, pages 10-28, 1993, Springer Verlag. In Z. Cernekova, I. Pitas, Ch. Nikou, “Information Theory-Based Shot Cut/Fade Detection and Video Summarization”, IEEE CSVT, pages 82-91, 2006, mutual information (MI) is used for detecting scene cuts.
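The histogram comparison of consecutive frames mentioned above can be illustrated by the following minimal sketch. It is not the specific method of the cited publications: the function names, the representation of a frame as a flat sequence of 8-bit luma samples, the 16 histogram bins and the threshold of 0.5 are all assumptions chosen for illustration only.

```python
from typing import List, Sequence


def luma_histogram(frame: Sequence[int], bins: int = 16) -> List[float]:
    """Return a normalised histogram of 8-bit luma samples (hypothetical helper)."""
    if not frame:
        raise ValueError("frame must contain at least one sample")
    hist = [0] * bins
    for value in frame:
        # Map a sample in 0..255 to one of `bins` equally wide bins.
        hist[value * bins // 256] += 1
    total = len(frame)
    return [count / total for count in hist]


def histogram_distance(prev: Sequence[int], curr: Sequence[int]) -> float:
    """Half the L1 distance between two normalised histograms.

    The result is 0.0 for identical histograms and 1.0 for disjoint ones.
    """
    h_prev = luma_histogram(prev)
    h_curr = luma_histogram(curr)
    return 0.5 * sum(abs(a - b) for a, b in zip(h_prev, h_curr))


def detect_scene_cuts(frames: Sequence[Sequence[int]],
                      threshold: float = 0.5) -> List[int]:
    """Return the indices n of frames that differ strongly from frame n-1.

    The threshold is an assumed value; a practical detector would tune it.
    """
    cuts = []
    for n in range(1, len(frames)):
        if histogram_distance(frames[n - 1], frames[n]) > threshold:
            cuts.append(n)
    return cuts


if __name__ == "__main__":
    dark = [20] * 64      # two similar dark frames
    dark2 = [22] * 64
    bright = [200] * 64   # two similar bright frames after the cut
    bright2 = [205] * 64
    print(detect_scene_cuts([dark, dark2, bright, bright2]))
```

In this sketch the returned index corresponds to the first frame of the new shot, i.e. the picture ‘n’ following the scene cut. A pure histogram comparison ignores the spatial arrangement of the samples, so gradual transitions or strong motion may require a different measure or threshold.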
FIG. 1 shows that the picture ‘n’ following a scene cut SCC is usually coded as an intra frame picture I, and that different bit allocation schemes are adopted in CBR and VBR processing. In case of CBR, the encoder will try to keep the bit rate RCBR constant, as FIG. 1 illustrates, which will often cause serious picture quality degradation at scene changes. In case of VBR, more bits will be allocated to frame ‘n’ and the bit rate RVBR will increase significantly for a short time. Usually, subsequent frames will then be coded in ‘skipped’ mode because of the buffer constraint or the transmission rate constraint, i.e. RVBR will be nearly zero, as FIG. 1 illustrates, in order to return quickly to the average bit rate for the video sequence. This will often cause jerk artefacts in the video display. If an encoder does not handle a scene change well, it will usually treat the picture following the scene cut (i.e. the first picture of the new scene) as a picture similar to the previous one and allocate bits accordingly. If this picture is coded as a P or B frame, the picture quality will be seriously deteriorated because too few bits are allocated to it.
Also, in most rate control algorithms the parameters from coding previous pictures are usually used as candidate parameters for coding future pictures, which is not appropriate when a scene change occurs. This also results in a quality break, and the more accurate the bit rate control is, the more severe the problem becomes.
US-A-2005/0286629 proposes a method for scene cut coding using non-reference frames, wherein the scene cut frame and its neighbouring frames (before and after) are coded as non-reference frames (B frame type, as shown in FIG. 2) with increased quantisation parameters QP (i.e. with coarser quantisation) in order to reduce the bandwidth. In this case, however, the coding efficiency of the first P frame following the scene cut is very low due to the long prediction distance, and a longer picture delay is required. Better performance in the B frame coding and a good trade-off between quality and rate constraint cannot be assured.