Transitions from one picture to the next picture of a video signal are often characterized by either some motion in the new picture with reference to a previous picture that may have undergone a compression or processing operation, the appearance or disappearance of an object or part thereof, or the emergence of a new scene altogether that may render the previous coded, processed, or original picture less suitable for use as a prediction reference. Most of these events can be modeled using motion compensation (a special case of inter prediction) and intra-frame prediction. Motion compensation is used to predict samples in the current picture using samples in one or more previously coded pictures. More specifically, in block-based motion compensation, a block of samples in the current picture is predicted from a block of samples from some already decoded reference picture. The latter is known as a prediction block. A prediction block may be as simple as the collocated block in some previous coded picture, which corresponds to a motion vector of all-zeroes. To account for motion, however, a motion vector is transmitted that instructs the decoder to use some other, displaced, block that is a closer match to the block that is being predicted. The motion model can be as simple as translational, where the motion parameters consist of a horizontal and a vertical displacement motion vector, or as complex as the affine or perspective motion models that require 6 or 8 motion parameters. More complex motion compensation schemes may also yield prediction blocks that are combinations of multiple blocks that correspond to different motion parameters. However, video signals may also contain global or local illumination changes that cannot be effectively modeled by motion compensation (inter prediction) or intra prediction. 
These illumination changes typically appear as fades, cross-fades, flashes, and other local illumination changes, which may, for example, be caused by the presence of multiple lighting sources. Weighted prediction (WP), e.g., illumination compensation, can benefit prediction efficiency for fades, cross-fades, flashes, and other local illumination changes. Weighted prediction consists of weighting (multiplying) the color component samples, e.g., luma and/or chroma samples, by a gain and then adding an offset. Please note that within this disclosure, color parameters or color components may be used to refer to the individual components that comprise a color domain or space. Please also note that in some domains or spaces, the color components may comprise intensity related components and color related components. Intensity related components may comprise one or more of a luma value or a luminance value and color related components may comprise one or more of a chroma value or a chrominance value. State-of-the-art codecs, such as H.264/AVC, support weighted prediction of the samples that may be in one of many possible color spaces/domains. Weighted prediction is also useful for temporal pre- and post-filtering that can be used to reduce sensor noise, compression or other artifacts, and temporal flicker/inconsistencies, among others. In practice, the image processing operations responsible for the illumination change may not necessarily originate in the domain used to compress or process the image, e.g., the YCbCr domain. Intuition, supported by experimental findings, suggests that these operations are usually conducted in some other domain, usually the sRGB domain (sRGB is a widely used RGB color space for PCs and digital cameras), which is also closer to the human perceptual notion of color. Note that there is a multitude of possible YCbCr conversion formulas to and from the RGB color space. 
Also, whether the RGB values were gamma corrected prior to or after the operations that created the illumination changes must be accounted for. Apart from illumination changes that were created by processing (primarily fades and cross-fades), there are also illumination changes that are part of the content, such as global or local flashes, varying illumination from flashlights and light fixtures, and shifting natural lighting, among others.
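The two prediction tools discussed above, translational block-based motion compensation and weighted prediction, may be illustrated with the following minimal Python sketch. It is not taken from any codec implementation: the function names, the integer-only motion vector, the border clamping, and the floating-point gain are assumptions made for illustration (H.264/AVC, for example, signals weights as integers with a fixed-point denominator and supports sub-sample motion vectors).

```python
import math

def motion_compensate(reference_frame, top, left, size, mv_x, mv_y):
    """Fetch the size x size prediction block displaced by an integer motion vector.

    A motion vector of (0, 0) returns the collocated block. Coordinates that
    fall outside the frame are clamped to the border (assumption of this sketch).
    """
    height, width = len(reference_frame), len(reference_frame[0])
    block = []
    for row in range(top + mv_y, top + mv_y + size):
        r = min(max(row, 0), height - 1)
        block.append([reference_frame[r][min(max(col, 0), width - 1)]
                      for col in range(left + mv_x, left + mv_x + size)])
    return block

def weighted_prediction(block, gain, offset, bit_depth=8):
    """Apply s = gain * p + offset to every sample, round, and clip to the sample range."""
    max_value = (1 << bit_depth) - 1
    return [[min(max(math.floor(gain * p + offset + 0.5), 0), max_value)
             for p in row]
            for row in block]

# Toy 8x8 reference frame whose sample value encodes its position (10 * row + column).
reference_frame = [[10 * r + c for c in range(8)] for r in range(8)]

# Predict the 2x2 block at (top=2, left=2) from a block displaced by (+1, -1).
prediction = motion_compensate(reference_frame, top=2, left=2, size=2, mv_x=1, mv_y=-1)

# Model a fade by halving the samples and adding an offset of 8.
faded = weighted_prediction(prediction, gain=0.5, offset=8)
print(prediction, faded)
```

In this sketch the weighting is applied after the displaced block has been fetched, so the same gain and offset can be reused for any motion vector.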
Weighted prediction parameters, e.g., gain w and offset f, which are used for illumination compensation of sample s from sample p as s=w×p+f, are derived through some kind of weighted prediction parameter search/estimation (WP search). In its most straightforward and highest complexity form, one may use a brute force search scheme that considers all possible combinations of gains and offsets within some constrained search window, similar to the brute force, full search scheme employed for motion estimation, compute the distortion/similarity of the illumination compensated reference signals compared to the source signals, and select the illumination compensation parameters that result in minimum distortion. Motion estimation and compensation may also be considered during such a search. Full search is, however, computationally intensive. A variety of weighted parameter estimation techniques have been proposed that estimate the “optimal” gain and offset for some given block, region, or even an entire frame (for global illumination compensation). See, for example, K. Kamikura, et al., “Global Brightness-Variation Compensation for Video Coding,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 8, no. 8, December 1998, pp. 988-1000, which describes a global brightness-variation compensation scheme to improve the coding efficiency for video scenes that contain global brightness variations caused by fade in/out, camera-iris adjustment, flicker, illumination change, etc. See also Y. Kikuchi and T. Chujoh, “Interpolation coefficient adaptation in multi-frame interpolative prediction,” Joint Video Team of ISO/IEC MPEG and ITU-T VCEG, JVT-C103, March 2002, and H. Kato and Y. Nakajima, “Weighting factor determination algorithm for H.264/MPEG-4 AVC weighted prediction,” Proc. IEEE 6th Workshop on Multimedia Signal Proc., Siena, Italy, October 2004.
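The brute force WP search described above may be sketched in Python as follows. This is an illustrative sketch only and is not the method of any of the cited references: the function names, the sum of squared errors as the distortion measure, the gain grid with a denominator of 64, and the offset window of ±32 are all assumptions, and motion as well as sample clipping are ignored.

```python
def sum_squared_error(source, predicted):
    """Distortion between source samples and illumination-compensated samples."""
    return sum((s - p) ** 2 for s, p in zip(source, predicted))

def brute_force_wp_search(source, reference,
                          gain_steps=range(0, 129), gain_denominator=64,
                          offsets=range(-32, 33)):
    """Try every (gain, offset) pair in the search window and keep the best one.

    Returns (gain, offset, distortion) for the pair with minimum distortion.
    The window sizes are hypothetical defaults chosen for this sketch.
    """
    best_gain, best_offset, best_distortion = 1.0, 0, float("inf")
    for step in gain_steps:
        gain = step / gain_denominator
        for offset in offsets:
            predicted = [gain * p + offset for p in reference]
            distortion = sum_squared_error(source, predicted)
            if distortion < best_distortion:
                best_gain, best_offset, best_distortion = gain, offset, distortion
    return best_gain, best_offset, best_distortion

# Source samples generated from the reference with gain 0.75 and offset 10.
reference = [20, 60, 100, 140, 180, 220]
source = [0.75 * p + 10 for p in reference]
print(brute_force_wp_search(source, reference))
```

With the default window, the sketch evaluates 129 × 65 candidate pairs for every block or frame, which illustrates why full search is costly and why lower-complexity estimation techniques are of interest.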
The methods, as described in the references discussed above, have to be applied separately to every color component for the best possible results. Weighted parameter search may also benefit from motion estimation and compensation. See, for example, J. M. Boyce, “Weighted prediction in the H.264/MPEG-4 AVC video coding standard,” Proc. IEEE International Symposium on Circuits and Systems, Vancouver, Canada, May 2004, vol. 3, pp. 789-792. In the H.264/AVC standard, motion compensation of the samples is followed by application of weighted prediction to yield the final predicted samples. Hence, during WP search, there is interdependency between the motion vectors and the weighted parameters. This can be addressed by performing multiple iterations as follows: weighted parameters are initialized with some simple algorithm and are used to scale and offset the reference frame. The scaled and offset reference frame is then used for motion estimation that yields motion vectors. Alternatively, scaling and offsetting of the samples may be incorporated in the motion estimation step. In such an implementation, scale and offset are considered on-the-fly and there is no need for a discrete step that creates a weighted reference. In the second iteration, these motion vectors are used during WP search so that for each macroblock (MB) the actual prediction reference block is used to derive the WP parameters. This is then again followed either by generation of a scaled and offset reference frame that undergoes motion estimation or by a single motion estimation step that considers the already derived scale and offset. Usually, these algorithms terminate once the WP parameters have converged.
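The iterative procedure described above may be sketched in Python as follows, under strong simplifying assumptions that are not part of the cited work: the "frames" are one-dimensional sample lists, a single integer displacement stands in for per-macroblock motion vectors, the initial gain is the ratio of the frame means, and the WP parameters are re-estimated by an ordinary least-squares fit. Gain and offset are applied on-the-fly inside the motion search, so no separate weighted reference is created. All function names are hypothetical.

```python
def estimate_motion(current, reference, gain, offset, search_range):
    """Full search for the displacement that minimizes SAD against the
    reference, with gain and offset applied on-the-fly."""
    window = range(search_range, len(current) - search_range)
    best_displacement, best_sad = 0, float("inf")
    for d in range(-search_range, search_range + 1):
        sad = sum(abs(current[i] - (gain * reference[i + d] + offset)) for i in window)
        if sad < best_sad:
            best_displacement, best_sad = d, sad
    return best_displacement

def estimate_wp(current, reference, displacement, search_range):
    """Least-squares gain and offset using the motion-compensated reference samples."""
    window = range(search_range, len(current) - search_range)
    xs = [reference[i + displacement] for i in window]
    ys = [current[i] for i in window]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:  # flat reference: only an offset can be estimated
        return 1.0, mean_y - mean_x
    gain = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return gain, mean_y - gain * mean_x

def iterative_wp_me(current, reference, search_range=4, max_iterations=10, tolerance=1e-3):
    """Alternate motion estimation and WP estimation until the WP parameters converge."""
    mean_ref = sum(reference) / len(reference)
    gain = (sum(current) / len(current)) / mean_ref if mean_ref else 1.0
    offset, displacement = 0.0, 0
    for _ in range(max_iterations):
        displacement = estimate_motion(current, reference, gain, offset, search_range)
        new_gain, new_offset = estimate_wp(current, reference, displacement, search_range)
        converged = abs(new_gain - gain) < tolerance and abs(new_offset - offset) < tolerance
        gain, offset = new_gain, new_offset
        if converged:
            break
    return displacement, gain, offset

# Current frame: the reference shifted by one sample, scaled by 0.5, and offset by 12.
reference = [60, 200, 90, 150, 40, 220, 130, 70, 180, 100, 240, 50, 160, 110, 210, 80]
current = [0.5 * reference[min(i + 1, len(reference) - 1)] + 12
           for i in range(len(reference))]
displacement, gain, offset = iterative_wp_me(current, reference, search_range=2)
print(displacement, round(gain, 3), round(offset, 3))
```

In this toy example the rough initial gain is sufficient for the first motion search to find the correct displacement, after which the WP estimate uses the actual prediction samples and the loop stops once the parameters no longer change.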
As indicated above, methods known in the art for addressing illumination compensation may be computationally intensive, resulting in decreased performance for imagery display or increased hardware costs to provide desired performance. Note that even though the terms frame and picture may be used interchangeably, such interchangeable usage should not be interpreted to exclude interlace-scan content such as field pictures. The teachings of this disclosure are applicable both to progressively-scanned frames and to interlace-scanned field (top or bottom) pictures.