1. Field of the Invention
The present invention relates to an image sequence coding and decoding method which performs interframe prediction using quantized values for chrominance or luminance intensity.
2. Related Art
In high efficiency coding of image sequences, interframe prediction (motion compensation), which exploits the similarity between temporally adjacent frames, is known to be a highly effective technique for data compression. Today's most frequently used motion compensation method is block matching with half pixel accuracy, which is used in the international standards H.263, MPEG1, and MPEG2. In this method, the image to be coded is segmented into blocks, and the horizontal and vertical components of the motion vectors of these blocks are estimated as integral multiples of half the distance between adjacent pixels. This process is described using the following equation:
[Equation 1]
$$P(x, y) = R(x + u_i,\; y + v_i), \quad (x, y) \in B_i, \quad 0 \le i < N \qquad (1)$$
where P(x, y) and R(x, y) denote the sample values (luminance or chrominance intensity) of pixels located at coordinates (x, y) in the predicted image P of the current frame and the reference image R (the decoded image of a frame which has been encoded before the current frame), respectively. “x” and “y” are integers, and it is assumed that all the pixels are located at points where the coordinate values are integers. Additionally, it is assumed that the sample values of the pixels are quantized to non-negative integers. N, Bi, and (ui, vi) denote the number of blocks in the image, the set of pixels included in the i-th block of the image, and the motion vector of the i-th block, respectively.
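As an illustration of Equation 1 for the integer motion vector case, the following is a minimal Python sketch, not the implementation of any standard. The function name `predict_block_matching`, the list-of-rows image layout (`ref[y][x]`), and the clamping of coordinates at the image border are assumptions introduced here; the text above does not specify how references outside the image are handled.

```python
def predict_block_matching(ref, blocks, motion_vectors):
    """Synthesize a predicted image P from a reference image R (Equation 1).

    ref            -- reference image as a list of rows, ref[y][x]
    blocks         -- list of pixel sets B_i, each a list of (x, y) tuples
    motion_vectors -- list of integer motion vectors (u_i, v_i), one per block
    """
    height, width = len(ref), len(ref[0])
    pred = [[0] * width for _ in range(height)]
    for pixels, (u, v) in zip(blocks, motion_vectors):
        for x, y in pixels:
            # Assumption (not from the source): coordinates that fall outside
            # the reference image are clamped to the nearest border pixel.
            sx = min(max(x + u, 0), width - 1)
            sy = min(max(y + v, 0), height - 1)
            pred[y][x] = ref[sy][sx]  # P(x, y) = R(x + u_i, y + v_i)
    return pred
```

For example, with a single block covering a 4x4 image and the motion vector (1, 0), every predicted pixel takes the value of its right-hand neighbor in the reference image, with the last column repeated because of the clamping assumption.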
When the values for “ui” and “vi” are not integers, it is necessary to find the intensity value at a point where no pixel actually exists in the reference image. Currently, bilinear interpolation using the four adjacent pixels is the most frequently used method for this process. This interpolation method is described using the following equation:
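[Equation 2]
$$R\left(x + \frac{p}{d},\; y + \frac{q}{d}\right) = \bigl((d - q)\bigl((d - p)R(x, y) + pR(x + 1, y)\bigr) + q\bigl((d - p)R(x, y + 1) + pR(x + 1, y + 1)\bigr)\bigr) \mathbin{//} d^2 \qquad (2)$$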
where “d” is a positive integer, and “p” and “q” are smaller than “d” but not smaller than “0”. “//” denotes integer division which rounds the result of normal division (division using real numbers) to the nearest integer.
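The interpolation of Equation 2 can be sketched in Python as follows. This is a hedged illustration only: the helper names `round_div` and `bilinear_sample` are hypothetical, the image is assumed to be a list of rows (`ref[y][x]`), neighbors beyond the image border are assumed to be clamped, and exact half values are assumed to be rounded up, since the passage above only states that “//” rounds to the nearest integer. Note that the “//” of Equation 2 is a rounding division and is not the same as Python's floor-division operator.

```python
def round_div(numerator, denominator):
    """Divide two non-negative integers and round to the nearest integer.

    Assumption (not stated in the text): an exact half is rounded up.
    """
    return (2 * numerator + denominator) // (2 * denominator)


def bilinear_sample(ref, x, y, p, q, d):
    """Return R(x + p/d, y + q/d) following Equation 2, with 0 <= p, q < d."""
    height, width = len(ref), len(ref[0])
    # Assumption: neighbors outside the image are clamped to the border.
    x1 = min(x + 1, width - 1)
    y1 = min(y + 1, height - 1)
    top = (d - p) * ref[y][x] + p * ref[y][x1]        # row y
    bottom = (d - p) * ref[y1][x] + p * ref[y1][x1]   # row y + 1
    return round_div((d - q) * top + q * bottom, d * d)


ref = [[10, 20],
       [30, 41]]
print(bilinear_sample(ref, 0, 0, 1, 0, 2))  # half pixel to the right: 15
print(bilinear_sample(ref, 0, 0, 1, 1, 2))  # center of the four pixels: 25
```

With d = 2 this corresponds to the half pixel accuracy case. The second call evaluates 101 / 4 = 25.25, which rounds to 25; a value such as 6 / 4 = 1.5 would round to 2 under the tie-handling assumption made here.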
An example of the structure of an H.263 video encoder is shown in FIG. 1. As the coding algorithm, H.263 adopts a hybrid coding method (adaptive interframe/intraframe coding method) which is a combination of block matching and DCT (discrete cosine transform). A subtractor 102 calculates the difference between the input image (current frame base image) 101 and the output image 113 (described later) of the interframe/intraframe coding selector 119, and then outputs an error image 103. This error image is converted into DCT coefficients in a DCT converter 104 and then quantized in a quantizer 105 to form quantized DCT coefficients 106. These quantized DCT coefficients are transmitted through the communication channel and, at the same time, are used to synthesize the interframe predicted image in the encoder.
The procedure for synthesizing the predicted image is explained next. The above mentioned quantized DCT coefficients 106 form the reconstructed error image 110 (the same as the reconstructed error image on the receiver side) after passing through a dequantizer 108 and an inverse DCT converter 109. This reconstructed error image and the output image 113 of the interframe/intraframe coding selector 119 are added at the adder 111, and the decoded image 112 of the current frame (the same image as the decoded image of the current frame reconstructed on the receiver side) is obtained. This image is stored in a frame memory 114 and delayed for a time equal to the frame interval. Accordingly, at the current point, the frame memory 114 outputs the decoded image 115 of the previous frame. This decoded image of the previous frame and the original image 101 of the current frame are input to the block matching section 116, and block matching is performed between these images. In the block matching process, the original image of the current frame is segmented into multiple blocks, and the predicted image 117 of the current frame is synthesized by extracting the sections most resembling these blocks from the decoded image of the previous frame. In this process, it is necessary to estimate the motion between the previous frame and the current frame for each block. The motion vector for each block estimated in the motion estimation process is transmitted to the receiver side as motion vector data 120.
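The motion estimation step of the block matching process described above can be sketched as an exhaustive search. The following Python example is an assumption-laden illustration: the text does not state which matching criterion or search strategy is used, so the sum of absolute differences (SAD), the full search over a square window, the restriction to integer motion vectors, and the names `sad` and `estimate_motion` are all choices made here for clarity.

```python
def sad(cur, ref, bx, by, u, v, size):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by the candidate vector (u, v)."""
    total = 0
    for y in range(by, by + size):
        for x in range(bx, bx + size):
            total += abs(cur[y][x] - ref[y + v][x + u])
    return total


def estimate_motion(cur, ref, bx, by, size, search):
    """Full search for the integer motion vector (u, v) that minimizes the
    SAD, with |u| <= search and |v| <= search.

    Assumptions (not from the source): SAD as the matching criterion, and
    candidates that would read outside the reference image are skipped.
    Returns ((u, v), cost); ((0, 0), None) if no candidate is valid.
    """
    height, width = len(ref), len(ref[0])
    best_vector, best_cost = (0, 0), None
    for v in range(-search, search + 1):
        for u in range(-search, search + 1):
            inside = (0 <= bx + u and bx + u + size <= width and
                      0 <= by + v and by + v + size <= height)
            if not inside:
                continue
            cost = sad(cur, ref, bx, by, u, v, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (u, v), cost
    return best_vector, best_cost
```

The sign convention follows Equation 1: a returned vector (u, v) means that the block at (x, y) in the current frame is predicted from position (x + u, y + v) in the decoded previous frame. A half pixel search would additionally evaluate interpolated candidates, which this sketch omits.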
On the receiver side, the same prediction image as on the transmitter side is synthesized using the motion vector information and the decoded image of the previous frame. The prediction image 117 is input along with a “0” signal 118 to the interframe/intraframe coding selector 119. This switch 119 selects interframe coding or intraframe coding by selecting either of these inputs. Interframe coding is performed when the prediction image 117 is selected (this case is shown in FIG. 2). On the other hand, when the “0” signal is selected, intraframe coding is performed, since the input image itself is converted to DCT coefficients and output to the communication channel.
In order for the receiver side to correctly reconstruct the coded image, the receiver must be informed whether intraframe coding or interframe coding was performed on the transmitter side. Consequently, an identifier flag 121 is output to the communication circuit. Finally, an H.263 coded bitstream 123 is acquired by multiplexing the quantized DCT coefficients, the motion vectors, and the interframe/intraframe identifier flag information in a multiplexer 122.
The structure of a decoder 200 for receiving the coded bit stream output from the encoder of FIG. 1 is shown in FIG. 2. The H.263 coded bit stream 217 that is received is demultiplexed into quantized DCT coefficients 201, motion vector data 202, and an interframe/intraframe identifier flag 203 in the demultiplexer 216. The quantized DCT coefficients 201 become a decoded error image 206 after being processed by an inverse quantizer 204 and inverse DCT converter 205. This decoded error image is added to the output image 215 of the interframe/intraframe coding selector 214 in an adder 207 and the sum of these images is output as the decoded image 208. The output of the interframe/intraframe coding selector is switched according to the interframe/intraframe identifier flag 203. A prediction image 212 utilized when performing interframe encoding is synthesized in the prediction image synthesizer 211. In this synthesizer, the position of the blocks in the decoded image 210 of the prior frame stored in frame memory 209 is shifted according to the motion vector data 202. On the other hand, for intraframe coding, the interframe/intraframe coding selector outputs the “0” signal 213 as is.
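The final reconstruction step in the decoder (selector 214 followed by adder 207) can be sketched as below. This is a simplified Python illustration under stated assumptions: the function name `reconstruct_frame` is hypothetical, the identifier flag is treated as a single boolean for the whole image, images are lists of rows, and no clipping of the sum to a valid sample range is applied because the passage above does not describe one.

```python
def reconstruct_frame(decoded_error, prediction, is_interframe):
    """Add the decoded error image 206 to the selector output 215.

    decoded_error -- decoded error image as a list of rows
    prediction    -- prediction image 212 (same dimensions)
    is_interframe -- assumed boolean form of the identifier flag 203
    """
    decoded = []
    for error_row, pred_row in zip(decoded_error, prediction):
        row = []
        for error, pred in zip(error_row, pred_row):
            # The selector outputs the prediction for interframe coding
            # and the "0" signal for intraframe coding.
            selected = pred if is_interframe else 0
            row.append(error + selected)
        decoded.append(row)
    return decoded


error = [[1, -2], [0, 3]]
prediction = [[10, 10], [10, 10]]
print(reconstruct_frame(error, prediction, True))   # [[11, 8], [10, 13]]
print(reconstruct_frame(error, prediction, False))  # [[1, -2], [0, 3]]
```

In the intraframe case the decoded error image is itself the decoded image, which matches the description of the “0” signal being passed through the selector unchanged.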