A Multi-view Video Coding (MVC) sequence is a set of two or more video sequences that capture the same scene from different viewpoints. It has been widely recognized that Multi-view Video Coding is a key technology that serves a wide variety of applications, including free-viewpoint and 3D video applications, home entertainment, surveillance, and so forth. Those multi-view applications often involve a very large amount of video data.
In a practical scenario, Multi-view Video Coding systems involving a large number of cameras are built using heterogeneous cameras, or cameras that have not been perfectly calibrated. This leads to differences in luminance and chrominance when the same parts of a scene are viewed with different cameras. Moreover, camera distance and positioning also affect illumination, in the sense that the same surface may reflect light differently when perceived from different angles. Under these scenarios, luminance and chrominance differences will decrease the efficiency of cross-view prediction.
Several prior art approaches have been developed to solve the illumination mismatch problem between pairs of images. In a first prior art approach, the decision whether to apply a local brightness variation model is based on cross entropy values. If the cross entropy is larger than a threshold, global and local brightness variation compensation is applied using a multiplier (scale) and offset field. However, local parameters are only selected after the best matching block has been found, which can be disadvantageous when illumination mismatches are significant. Similarly, a second prior art approach proposes a modified motion estimation approach, but uses a global illumination compensation model. The second prior art approach also proposes a block-by-block on/off control method; however, that method is based on mean squared error (MSE). A third prior art approach addresses an illumination mismatch problem in video sequences. In the third prior art approach, a scale/offset parameter for a 16×16 macroblock and predictive coding of the parameter are proposed. The third prior art approach also proposes an enabling switch based on rate distortion cost. However, the third prior art approach mainly focuses on temporal video sequences. In temporal video sequences, illumination mismatch does not occur as consistently as it does in cross-view prediction.
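By way of illustration only, the scale/offset model and the block-by-block on/off control mentioned above may be sketched as follows. The sketch is not taken from any of the prior art approaches: the function names, the least-squares fit of the scale and offset, and the MSE-gain threshold are all assumptions chosen for clarity, and blocks are represented as flat lists of sample values.

```python
def fit_scale_offset(ref, cur):
    """Least-squares fit of cur ~= scale * ref + offset (assumed fitting rule)."""
    n = len(ref)
    mean_r = sum(ref) / n
    mean_c = sum(cur) / n
    var_r = sum((r - mean_r) ** 2 for r in ref)
    if var_r == 0:
        # Flat reference block: only an offset can be estimated.
        return 1.0, mean_c - mean_r
    cov = sum((r - mean_r) * (c - mean_c) for r, c in zip(ref, cur))
    scale = cov / var_r
    offset = mean_c - scale * mean_r
    return scale, offset


def mse(a, b):
    """Mean squared error between two blocks of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def compensate(ref, scale, offset):
    """Apply the scale/offset model to every sample of the reference block."""
    return [scale * r + offset for r in ref]


def decide_compensation(ref, cur, min_gain=0.0):
    """Hypothetical MSE-based on/off switch for a single block.

    Compensation is enabled only when it lowers the MSE by more than
    min_gain (an assumed threshold, not a value from the source).
    """
    scale, offset = fit_scale_offset(ref, cur)
    plain_error = mse(ref, cur)
    compensated_error = mse(compensate(ref, scale, offset), cur)
    enabled = (plain_error - compensated_error) > min_gain
    return enabled, scale, offset


# Example: the current block is a brighter, higher-contrast version of
# the reference block, as might occur between two mismatched cameras.
reference_block = [10, 20, 30, 40]
current_block = [17, 29, 41, 53]
enabled, scale, offset = decide_compensation(reference_block, current_block)
```

In this example the current block follows the reference block with a scale of about 1.2 and an offset of about 5, so the switch enables compensation. A switch based on rate distortion cost would additionally account for the bits needed to signal the parameters, which this sketch omits.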