Flash lighting is widely used in taking photographs. When video sequences are captured at events such as news reports, interviews, conferences and sports matches, flashes often appear in the video because photographers, e.g. journalists, are taking pictures at the same time. A typical characteristic of a flash picture is that its intensity or brightness increases abruptly, so that the flash picture has a much stronger intensity than the previous and the following pictures in the video sequence. Another characteristic of a flash picture is that the intensity change is non-uniformly distributed within the picture. That is, some parts of the picture may have a greater intensity increase than other parts. Moreover, because objects lie at different distances from the flashlight, and because of occlusion and shadows, it is hard to find an accurate model for estimating the change of intensity within the picture.
For the above two reasons, some unusual phenomena will be noticed when the video is encoded by existing video coding technologies. These technologies, such as MPEG-2, H.263, MPEG-4 AVC/H.264 and VC-1, are based on hybrid video coding and use motion estimation to reduce temporal redundancy. The motion estimation is block-based and tries to find the best-matching block by minimising the sum of absolute differences (SAD) of the residues. However, when a flash occurs, for example in picture Pn in FIG. 1, the intensity changes so much that the motion estimation cannot find a well-matching block in a previous picture Pn−1 or Pn−2. Accordingly, the video encoder usually tends to encode picture Pn in intra mode, since in this case intra coding can achieve a slightly better rate-distortion performance than inter coding. Nevertheless, no matter in which mode the blocks or macroblocks of this picture are coded, a large number of bits will be produced, so that the whole flash picture Pn will usually generate many more bits than the neighbouring non-flash pictures Pn−1 and Pn+1, and this will cause a significant bit rate fluctuation for transmission.
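The block matching described above can be illustrated with a minimal sketch. It is not taken from any particular encoder: the function names (sad, get_block, best_match), the 8x8 test pattern, the block size, the search range and the synthetic flash model (a brightness increase that grows from left to right) are all assumptions chosen for illustration only.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(picture, top, left, size):
    """Cut a size x size block out of a picture stored as a list of rows."""
    return [row[left:left + size] for row in picture[top:top + size]]


def best_match(current, reference, top, left, size, search_range):
    """Full search: return (minimum SAD, (dy, dx)) for one block."""
    target = get_block(current, top, left, size)
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue  # candidate block would leave the picture
            cost = sad(target, get_block(reference, y, x, size))
            if best is None or cost < best[0]:
                best = (cost, (dy, dx))
    return best


# Hypothetical previous picture P(n-1): an arbitrary 8x8 test pattern.
prev = [[(7 * y + 13 * x) % 50 for x in range(8)] for y in range(8)]
# Non-flash case: the same content shifted one pixel to the right.
cur = [[row[0]] + row[:-1] for row in prev]
# Flash case (assumed model): non-uniform brightening of the shifted picture.
flash = [[p + 60 + 10 * x for x, p in enumerate(row)] for row in cur]

normal_cost, normal_mv = best_match(cur, prev, 2, 2, 4, 1)
flash_cost, flash_mv = best_match(flash, prev, 2, 2, 4, 1)
print(normal_cost, normal_mv)  # 0 (0, -1): the shifted block is found exactly
print(flash_cost > normal_cost)  # True: no candidate matches the flash block
```

In the non-flash case the search finds the displaced block with zero residue, whereas for the flash picture every candidate in the search window leaves a large residue, which is the situation that drives the encoder towards intra coding and a high bit count.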
If only one reference frame is used, the encoding of the non-flash picture Pn+1 that follows the flash picture Pn will meet the same problem: the motion estimation for the non-flash picture Pn+1 cannot find a matching block in the flash picture Pn, since there is a large intensity difference between the two pictures. Consequently, a large number of bits are again generated for the non-flash picture Pn+1. Fortunately, the multiple reference frames feature in H.264/AVC solves this problem. The blocks or macroblocks of non-flash picture Pn+1 can be predicted from the other non-flash picture Pn−1, and hence the encoding of picture Pn+1 will not produce a large number of bits. However, multiple reference frames still cannot prevent the encoding of the flash picture Pn itself from producing too many bits.
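The benefit of multiple reference frames for picture Pn+1 can be sketched as follows. This is a simplified illustration, not the H.264/AVC reference selection procedure: the helper pick_reference, the 2x2 sample blocks and the use of zero-motion SAD as the only selection criterion are assumptions made to keep the example short.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def pick_reference(block, candidates):
    """Return (index, SAD) of the candidate block with the smallest SAD."""
    costs = [sad(block, candidate) for candidate in candidates]
    index = min(range(len(costs)), key=costs.__getitem__)
    return index, costs[index]


# Hypothetical co-located 2x2 blocks (8-bit luma values, zero motion assumed).
p_prev = [[50, 52], [54, 56]]       # non-flash picture P(n-1)
p_flash = [[190, 230], [170, 240]]  # flash picture P(n)
p_next = [[51, 52], [54, 57]]       # non-flash picture P(n+1), block to encode

single = pick_reference(p_next, [p_flash])         # only P(n) available
multi = pick_reference(p_next, [p_flash, p_prev])  # P(n) and P(n-1) available
print(single)  # (0, 616)
print(multi)   # (1, 2)
```

With a single reference the encoder is forced to predict from the flash picture and the residue is large; once Pn−1 is also available, it is selected and the residue becomes small. The flash picture Pn itself gains nothing from this, because none of its candidate references contains the flash.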
For the H.264/AVC Main and Extended profiles, another approach denoted ‘weighted prediction’ has been proposed by J. M. Boyce, “Weighted prediction in the H.264/MPEG AVC video coding standard”, IEEE 2004, ISCAS 2004, in order to deal with the problem of coding fade-in and fade-out, and at the same time it tries to reduce, to some extent, the bit rate of coding a flash picture. There are two weighted prediction modes: explicit mode, which is supported in P, SP, and B slices, and implicit mode, which is supported in B slices only. In the explicit mode, weighting factors (including multiplicative weighting factors and additive offsets) are transmitted in the bit stream, while in the implicit mode the weighting factors are instead derived based on the relative distances between the current picture and the reference pictures.
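The explicit mode can be sketched for the single-reference case as an integer multiply, a rounded right shift and an additive offset, followed by clipping. The code below is a simplified illustration under stated assumptions rather than a conforming implementation: the function name weighted_pred, the parameter names, the 8-bit clipping range and the sample values are assumptions, and the bi-predictive and implicit cases are not shown.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def weighted_pred(ref_block, weight, offset, log_wd):
    """Single-reference weighted prediction (sketch, 8-bit samples assumed).

    The multiplicative factor is weight / 2**log_wd; offset is added after
    the rounded shift, and the result is clipped to the range 0..255.
    """
    rounding = (1 << (log_wd - 1)) if log_wd >= 1 else 0
    return [[min(255, max(0, ((p * weight + rounding) >> log_wd) + offset))
             for p in row]
            for row in ref_block]


# Hypothetical values: reference block, and a brighter block to be predicted
# whose bottom-right sample is brightened more than the others.
ref = [[40, 60], [80, 100]]
flash = [[70, 100], [130, 200]]

pred = weighted_pred(ref, weight=48, offset=10, log_wd=5)  # factor 1.5, +10
print(pred)              # [[70, 100], [130, 160]]
print(sad(flash, ref))   # 220: residue without weighting
print(sad(flash, pred))  # 40: residue with one weight and offset
```

In this example a single weight and offset removes most of the residue but not all of it: the sample that was brightened more strongly than the rest is still mispredicted, which illustrates why such a model can help with a non-uniform flash only to some extent.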