1. Field of the Invention
Methods and apparatuses consistent with the present invention relate to video coding and decoding using a weighted prediction, and more particularly, to video coding and decoding using a weighted prediction which can reduce the amount of residual signal by generating a weighted prediction image by multiplying a predicted image of a present block by a specified scaling factor and by coding a residual signal obtained by subtracting the weighted prediction image from the present block.
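The weighted prediction described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the function names, the sample values, and the scaling factor of 2.0 are all hypothetical, and how the scaling factor is chosen is not covered here.

```python
# Minimal sketch of a weighted prediction residual (names and values are hypothetical).

def weighted_residual(block, predicted, scale):
    """Return block - scale * predicted, computed element by element."""
    return [
        [b - scale * p for b, p in zip(block_row, pred_row)]
        for block_row, pred_row in zip(block, predicted)
    ]

def energy(residual):
    """Sum of squared residual samples, a rough proxy for the amount of residual signal."""
    return sum(r * r for row in residual for r in row)

# Hypothetical 2x2 present block and its predicted image; the present block
# happens to be exactly twice as bright as the prediction.
block = [[100, 104], [108, 112]]
predicted = [[50, 52], [54, 56]]

plain = weighted_residual(block, predicted, 1.0)     # ordinary prediction
weighted = weighted_residual(block, predicted, 2.0)  # weighted prediction
```

In this contrived case the weighted residual is all zeros while the unweighted residual is not, which shows how a suitable scaling factor can reduce the residual signal to be coded.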
2. Description of the Prior Art
With the development of information and communication technologies, multimedia communications are increasing in addition to text and voice communications. The existing text-centered communication systems do not satisfy the diverse desires of consumers, and thus multimedia services that can accommodate diverse forms of information such as text, images, music, and others are increasing. Since multimedia data is large, mass storage media and wide bandwidths are respectively required for storing and transmitting the multimedia data. Accordingly, compression coding techniques are required to transmit the multimedia data, which includes text, images and audio data.
The basic principle of data compression is to remove data redundancy. Data can be compressed by removing spatial redundancy such as a repetition of the same color or object in images, temporal redundancy such as little change of adjacent frames in moving image frames or continuous repetition of sounds in audio, and visual/perceptual redundancy, which considers the human insensitivity to high frequencies. In a general video coding method, the temporal redundancy is removed by temporal filtering based on motion compensation, and the spatial redundancy is removed by a spatial transform.
FIG. 1 is a view illustrating a prediction in a conventional video coding method.
Existing video codecs, such as MPEG-4 and H.264 codecs, raise the compression efficiency by removing the similarity between adjacent frames on the basis of motion compensation. Generally, a prediction of a similar image in a reference frame temporally preceding the present frame 110 is called a forward prediction 120, and a prediction of a similar image in a reference frame temporally following the present frame is called a backward prediction 130. A temporal prediction using both a forward reference frame and a backward reference frame is called a bidirectional prediction 140.
The existing single-layer video codecs can improve their efficiency by selecting an optimum mode among the various modes described above and coding with the selected mode. On the other hand, multilayer video codecs, such as the H.264 scalable extension (or MPEG scalable video coding), use another prediction method, i.e., a base layer prediction method 150, in order to remove the similarity between layers. That is, the video codec predicts a block to be presently coded from the image of the base layer frame that is in the same temporal position as that block. In this case, if the respective layers have different resolutions, the video codec performs the prediction after up-sampling the base layer image so that its resolution matches the resolution of the present layer.
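The base layer prediction with up-sampling can be sketched as follows. Nearest-neighbor up-sampling is an assumption made only to keep the example short; actual codecs use interpolation filters, and all names and sample values here are hypothetical.

```python
# Minimal sketch of base layer prediction across resolutions (hypothetical names and data).

def upsample_nearest(image, factor):
    """Up-sample a 2-D image by repeating each sample `factor` times in both directions."""
    out = []
    for row in image:
        wide = [px for px in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def base_layer_residual(current_block, base_block, factor):
    """Predict the present-layer block from the up-sampled base layer block
    and return the residual (current minus prediction)."""
    prediction = upsample_nearest(base_block, factor)
    return [
        [c - p for c, p in zip(cur_row, pred_row)]
        for cur_row, pred_row in zip(current_block, prediction)
    ]

# Hypothetical 2x2 base layer block and the co-located 4x4 present-layer block.
base_block = [[10, 20], [30, 40]]
current_block = [
    [11, 10, 21, 20],
    [10, 10, 20, 19],
    [30, 31, 40, 40],
    [29, 30, 41, 40],
]

residual = base_layer_residual(current_block, base_block, 2)
```

Because the two layers depict the same content at the same temporal position, the residual consists of small values, which is the similarity between layers that this prediction removes.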
Although a prediction mode may be selected on several grounds, a typical approach is to actually perform coding with each of the prediction methods and to select the method that has the lowest cost. The cost C may be defined in various ways, and a representative cost is calculated on the basis of rate-distortion as in Equation (1). Here, E denotes the difference between the original signal and the signal restored by decoding the coded bits, and B denotes the number of bits required for performing the respective method. Also, λ denotes a Lagrangian coefficient that adjusts the relative weighting of E and B.
C = E + λB  (1)
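The cost-based mode selection can be sketched as follows. The cost function follows the form C = E + λB; the mode names, the (E, B) measurements, and the value of λ are hypothetical and chosen only for illustration.

```python
# Minimal sketch of rate-distortion mode selection (hypothetical measurements).

def rd_cost(distortion, bits, lam):
    """Cost C = E + lambda * B."""
    return distortion + lam * bits

def select_mode(candidates, lam):
    """Return the name of the prediction mode with the lowest cost.

    `candidates` maps a mode name to a (distortion E, bit count B) pair
    obtained by actually coding the block with that mode.
    """
    return min(candidates, key=lambda name: rd_cost(*candidates[name], lam))

# Hypothetical (E, B) results for one block under each prediction mode.
candidates = {
    "forward": (120.0, 40),
    "backward": (150.0, 30),
    "bidirectional": (90.0, 70),
    "base_layer": (100.0, 45),
}

best = select_mode(candidates, lam=1.0)
```

With these sample numbers, a larger λ favors modes that spend fewer bits, while λ = 0 selects purely on distortion, so the chosen mode changes as λ changes.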
Conventional video coding methods using temporal prediction are disclosed in many patent documents. For instance, Korean Patent Unexamined Publication No. 2004-047977 discloses a spatially scalable compression, and particularly a video coding method that includes calculating motion vectors for respective frames, based on the sum of an up-scaled base layer and an enhancement layer.
However, apart from the base layer prediction, the conventional video coding methods have a problem in that they make no use of the large amount of information available in the base layer.