In the highly efficient coding of a dynamic image, it is known to exploit the similarity between frames that are close to each other in time by using motion compensation to compress the data. The most widely used motion compensation system in present image coding technology is block matching, which is employed in H.261, MPEG1 and MPEG2, the international standards for dynamic image coding. According to this system, the image to be coded is divided into a number of blocks, and a motion vector is found for each of the blocks.
FIG. 1 illustrates the constitution of a coder 100 of the H.261 Standard which employs a hybrid coding system (adaptive interframe/intraframe coding method) which is a combination of block matching and DCT (discrete cosine transform). A subtractor 102 calculates the difference between an input image (original image of present frame) 101 and an output image 113 (that will be described later) of an interframe/intraframe switching unit 119, and outputs an error image 103. The error image is transformed into a DCT coefficient through a DCT processor 104 and is quantized through a quantizer 105 to obtain a quantized DCT coefficient 106. The quantized DCT coefficient is output as transfer data onto a communication line and is, at the same time, used in the coder to synthesize an interframe predicted image. A procedure for synthesizing the predicted image will be described below. The quantized DCT coefficient 106 passes through a dequantizer 108 and an inverse DCT processor 109 to form a reconstructed error image 110 (the same image as the error image reproduced on the receiving side).
An output image 113 (that will be described later) of the interframe/intraframe switching unit 119 is added thereto through an adder 111, thereby to obtain a reconstructed image 112 of the present frame (the same image as the reconstructed image of the present frame reproduced on the receiving side). The image is temporarily stored in a frame memory 114 and is delayed in time by one frame. At the present moment, therefore, the frame memory 114 is outputting a reconstructed image 115 of the preceding frame. The reconstructed image of the preceding frame and the input image 101 of the present frame are input to a block matching unit 116 where block matching is executed.
In the block matching, an image is divided into a plurality of blocks, and a portion most resembling the original image of the present frame is taken out for each of the blocks from the reconstructed image of the preceding frame, thereby synthesizing a predicted image 117 of the present frame. At this moment, it is necessary to execute processing (local motion estimation) for detecting how far each of the blocks has moved from the preceding frame to the present frame. The motion vectors of the blocks detected by the motion estimation are transmitted to the receiving side as motion data 120. From the motion data and the reconstructed image of the preceding frame, the receiving side can independently synthesize a predicted image which is the same as the one obtained on the transmitting side.
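The block matching described above can be sketched as follows. This is a minimal illustration, not the procedure prescribed by H.261: the exhaustive search, the sum of absolute differences (SAD) as the matching criterion, the block size `n`, the search range `search`, and all function names are assumptions made for the example. Frames are plain lists of rows of integer pixel values, and a motion vector (dx, dy) points from a block of the present frame to the matching position in the preceding frame.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total


def block_match(cur, ref, n=4, search=2):
    """Local motion estimation: return {(bx, by): (dx, dy)} holding, for
    each block, the displacement with the smallest SAD."""
    h, w = len(cur), len(cur[0])
    vectors = {}
    for by in range(0, h - n + 1, n):
        for bx in range(0, w - n + 1, n):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidates that fall outside the reference frame.
                    if not (0 <= bx + dx and bx + dx + n <= w
                            and 0 <= by + dy and by + dy + n <= h):
                        continue
                    cost = sad(cur, ref, bx, by, dx, dy, n)
                    if best is None or cost < best[0]:
                        best = (cost, dx, dy)
            vectors[(bx, by)] = (best[1], best[2])
    return vectors


def synthesize_prediction(ref, vectors, n, h, w):
    """Build the predicted image by copying, for each block, the displaced
    block of the preceding frame. The receiving side can run the same
    function on the transmitted vectors."""
    pred = [[0] * w for _ in range(h)]
    for (bx, by), (dx, dy) in vectors.items():
        for y in range(n):
            for x in range(n):
                pred[by + y][bx + x] = ref[by + y + dy][bx + x + dx]
    return pred
```

Because `synthesize_prediction` needs only the preceding reconstructed frame and the vectors, both sides arrive at the same predicted image without the original image of the present frame being available to the receiver.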
Referring again to FIG. 1, the predicted image 117 is input together with a "0" signal 118 to the interframe/intraframe switching unit 119. By selecting one of the two inputs, the switching unit switches the coding between interframe coding and intraframe coding. When the predicted image 117 is selected (FIG. 2 illustrates this case), interframe coding is executed. When the "0" signal is selected, on the other hand, the input image is directly DCT-coded and output to the communication line, so that intraframe coding is executed.
In order to properly obtain the reconstructed image on the receiving side, it is necessary to know whether interframe coding or intraframe coding was executed on the transmitting side. For this purpose, a distinction flag 121 is output to the communication line. The final H.261 coded bit stream 123 is obtained by multiplexing the quantized DCT coefficient, the motion vector, and the interframe/intraframe distinction flag in a multiplexer 122.
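The hybrid coding loop of FIG. 1 can be sketched for a single block as follows. This is an illustration under stated assumptions, not the H.261 algorithm itself: the uniform quantizer with step size `step`, the orthonormal floating-point DCT, and the names `dct_1d`, `idct_1d`, `transform_2d` and `code_block` are all hypothetical. The `inter` argument plays the role of the distinction flag 121; when it is false, the "0" signal replaces the predicted image.

```python
import math


def dct_1d(v):
    """Orthonormal DCT-II of a list of numbers."""
    n = len(v)
    out = []
    for k in range(n):
        s = sum(v[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def idct_1d(c):
    """Inverse of dct_1d (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n)
                 * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n))
        out.append(s)
    return out


def transform_2d(block, f):
    """Apply the 1-D transform f to every row, then to every column."""
    rows = [f(r) for r in block]
    cols = [f(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def code_block(inp, pred, step, inter):
    """Return (quantized levels, reconstructed block) for one block."""
    # Interframe/intraframe switch: predicted image or the "0" signal.
    ref = pred if inter else [[0] * len(r) for r in inp]
    err = [[a - b for a, b in zip(ri, rp)] for ri, rp in zip(inp, ref)]
    coef = transform_2d(err, dct_1d)
    levels = [[round(c / step) for c in row] for row in coef]  # transmitted
    # Local decoding, identical to what the receiving side performs.
    deq = [[l * step for l in row] for row in levels]
    err_rec = transform_2d(deq, idct_1d)
    recon = [[e + p for e, p in zip(re, rp)] for re, rp in zip(err_rec, ref)]
    return levels, recon
```

The second half of `code_block` mirrors the decoder, which is why the frame memory on the transmitting side holds the same reconstructed image as the one on the receiving side.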
FIG. 2 illustrates the constitution of a decoder 200 for receiving a coded bit stream output from the coder of FIG. 1. The H.261 bit stream 217 that is received is separated through a separator 216 into a quantized DCT coefficient 201, a motion vector 202, and an intraframe/interframe distinction flag 203. The quantized DCT coefficient 201 is decoded into an error image 206 through a dequantizer 204 and an inverse DCT processor 205. To the error image is added an output image 215 of an interframe/intraframe switching unit 214 through an adder 207 to form a reconstructed image 208.
The interframe/intraframe switching unit switches the output according to the interframe/intraframe coding distinction flag 203. A predicted image 212 that is used for executing the interframe coding is synthesized by a predicted image synthesizer 211. Here, the decoded image 210 of the preceding frame stored in the frame memory 209 is subjected to processing that moves the position of each of the blocks according to the received motion vector 202. In the case of intraframe coding, on the other hand, the interframe/intraframe switching unit outputs the "0" signal 213.
Block matching is currently the most widely used motion compensation system. When the whole image is expanding, contracting, or turning, however, the motion vectors of all of the blocks must be transmitted, causing a problem of low coding efficiency. To solve this problem, global motion compensation (e.g., M. Hotter, "Differential Estimation of the Global Motion Parameters Zoom and Pan", Signal Processing, Vol. 16, No. 3, pp. 249-265, Mar., 1989) has been proposed, which expresses the motion vector field of the whole image using only a few parameters. According to this motion compensation system, the motion vector (ug(x, y), vg(x, y)) of a pixel (x, y) in an image is expressed in the form of:
$$u_g(x, y) = a_0 x + a_1 y + a_2, \qquad v_g(x, y) = a_3 x + a_4 y + a_5 \qquad \text{(Equation 1)}$$
or
$$u_g(x, y) = b_0 xy + b_1 x + b_2 y + b_3, \qquad v_g(x, y) = b_4 xy + b_5 x + b_6 y + b_7 \qquad \text{(Equation 2)}$$
and the motion compensation is executed using the motion vectors. In these equations, a0 to a5 and b0 to b7 are motion parameters. In executing the motion compensation, the same predicted image must be generated both on the transmitting side and on the receiving side. For this purpose, the transmitting side may directly transmit values of a0 to a5 or b0 to b7 to the receiving side or may instead transmit motion vectors of several representative points.
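Equations 1 and 2 can be evaluated per pixel as sketched below. The function names are hypothetical, and the way the predicted image is sampled (nearest-neighbor rounding with clamping at the image edges) is an assumption made for the example, since the text does not specify the interpolation. The motion vector is taken to point from a pixel of the present frame to the corresponding position in the reference image.

```python
def affine_vector(a, x, y):
    """Equation 1: a = (a0, a1, a2, a3, a4, a5)."""
    return (a[0] * x + a[1] * y + a[2],
            a[3] * x + a[4] * y + a[5])


def bilinear_vector(b, x, y):
    """Equation 2: b = (b0, b1, ..., b7)."""
    return (b[0] * x * y + b[1] * x + b[2] * y + b[3],
            b[4] * x * y + b[5] * x + b[6] * y + b[7])


def global_prediction(ref, params, vector_fn):
    """Synthesize a predicted image from the reference image `ref` using
    the motion vector field given by vector_fn(params, x, y)."""
    h, w = len(ref), len(ref[0])
    pred = []
    for y in range(h):
        row = []
        for x in range(w):
            u, v = vector_fn(params, x, y)
            # Assumption: nearest-neighbor sampling, clamped to the image.
            sx = min(max(int(round(x + u)), 0), w - 1)
            sy = min(max(int(round(y + v)), 0), h - 1)
            row.append(ref[sy][sx])
        pred.append(row)
    return pred
```

With a = (0, 0, 1, 0, 0, 2), for example, every pixel receives the same vector (1, 2), so Equation 1 reduces to a pure translation; nonzero a0, a1, a3 and a4 add the position-dependent terms needed for zoom or rotation.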
As shown in FIG. 3A, assume that the coordinates of the pixels at the upper-left, upper-right, lower-left and lower-right corners of an image 301 are expressed by (0, 0), (r, 0), (0, s) and (r, s) (where r and s are positive integers). Here, letting the horizontal and vertical components of the motion vectors of the representative points (0, 0), (r, 0) and (0, s) be (ua, va), (ub, vb) and (uc, vc), respectively, Equation 1 is rewritten as:
$$u_g(x, y) = \frac{u_b - u_a}{r} x + \frac{u_c - u_a}{s} y + u_a, \qquad v_g(x, y) = \frac{v_b - v_a}{r} x + \frac{v_c - v_a}{s} y + v_a \qquad \text{(Equation 3)}$$
This means that the same function can be fulfilled even when ua, va, ub, vb, uc and vc are transmitted instead of a0 to a5. This state is shown in FIGS. 3A and 3B. Assuming that global motion compensation is effected between the original image 302 of the present frame shown in FIG. 3B and the reference image 301 shown in FIG. 3A, the motion vectors 306, 307 and 308 of the representative points 303, 304 and 305 may be transmitted instead of the motion parameters (a motion vector is defined to start from a point of the original image of the present frame and end at the corresponding point in the reference image). Similarly, by using the horizontal and vertical components (ua, va), (ub, vb), (uc, vc) and (ud, vd) of the motion vectors of four representative points (0, 0), (r, 0), (0, s) and (r, s), Equation 2 can be rewritten as:
$$u_g(x, y) = \frac{u_a - u_b - u_c + u_d}{rs} xy + \frac{u_b - u_a}{r} x + \frac{u_c - u_a}{s} y + u_a, \qquad v_g(x, y) = \frac{v_a - v_b - v_c + v_d}{rs} xy + \frac{v_b - v_a}{r} x + \frac{v_c - v_a}{s} y + v_a \qquad \text{(Equation 4)}$$
Therefore, a similar function is fulfilled even when ua, va, ub, vb, uc, vc, ud and vd are transmitted instead of b0 to b7. In this specification, the system using Equation 1 is referred to as global motion compensation based upon linear interpolation and/or extrapolation, and the system using Equation 2 is referred to as global motion compensation based upon bilinear interpolation and/or extrapolation.
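The equivalence between the motion parameters and the motion vectors of the representative points can be checked with a short sketch. The coefficients below are obtained by requiring Equation 1 (or Equation 2) to reproduce the given vectors at the representative points (0, 0), (r, 0), (0, s) and, for the bilinear case, (r, s). The function names are hypothetical, and integer vector components are assumed so that exact rational arithmetic can be used.

```python
from fractions import Fraction


def affine_params(ua, va, ub, vb, uc, vc, r, s):
    """a0..a5 of Equation 1 from the vectors at (0,0), (r,0) and (0,s)."""
    return (Fraction(ub - ua, r), Fraction(uc - ua, s), Fraction(ua),
            Fraction(vb - va, r), Fraction(vc - va, s), Fraction(va))


def bilinear_params(ua, va, ub, vb, uc, vc, ud, vd, r, s):
    """b0..b7 of Equation 2 from the vectors at the four corners."""
    def one(pa, pb, pc, pd):
        # Coefficients of xy, x, y and the constant term for one component.
        return (Fraction(pa - pb - pc + pd, r * s), Fraction(pb - pa, r),
                Fraction(pc - pa, s), Fraction(pa))
    return one(ua, ub, uc, ud) + one(va, vb, vc, vd)


def affine_at(a, x, y):
    """Evaluate Equation 1 at pixel (x, y)."""
    return (a[0] * x + a[1] * y + a[2], a[3] * x + a[4] * y + a[5])


def bilinear_at(b, x, y):
    """Evaluate Equation 2 at pixel (x, y)."""
    return (b[0] * x * y + b[1] * x + b[2] * y + b[3],
            b[4] * x * y + b[5] * x + b[6] * y + b[7])


# Usage: the field reproduces the transmitted vectors at the corners.
r, s = 8, 4
a = affine_params(1, 2, 3, 5, -1, 4, r, s)
corner_vectors = [affine_at(a, 0, 0), affine_at(a, r, 0), affine_at(a, 0, s)]
```

Each coefficient involves a division by r, s or r·s, which the exact `Fraction` arithmetic makes explicit.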
FIG. 4 illustrates the constitution of a motion compensation section 401 of an image coder employing the global motion compensation system based upon linear interpolation and/or extrapolation for transmitting motion vectors of the representative points. The same components as those of FIG. 1 are denoted by the same reference numerals. A video coder that executes global motion compensation can be constituted by substituting a motion compensation section 401 for the block matching unit 116 of FIG. 1.
A global motion compensation unit 402 performs motion estimation related to the global motion compensation between the decoded image 115 of the preceding frame and the original image 101 of the present frame, and estimates the values ua, va, ub, vb, uc and vc. The data 403 related to these values are transmitted as part of the motion data 120. A predicted image 404 of global motion compensation is synthesized using Equation 3, and is fed to a block matching unit 405. The motion is compensated by block matching between the predicted image of global motion compensation and the original image of the present frame, thereby generating motion vector data 406 of blocks and a final predicted image 117. The motion vector data and the motion parameter data are multiplexed through a multiplexing unit 407 and are output as motion data 120.
FIG. 5 illustrates the constitution of a motion compensation section 501 which is different from that of FIG. 4. A video coder that executes global motion compensation can be constituted by substituting a motion compensation section 501 for the block matching unit 116 of FIG. 1. In this embodiment, block matching is not adopted for the predicted image of global motion compensation but either global motion compensation or block matching is adopted for each of the blocks. Global motion compensation and block matching are executed in parallel by the global motion compensation unit 502 and the block matching unit 505 between the decoded image 115 of the preceding frame and the original image 101 of the present frame. A selection switch 508 selects an optimum system for each of the blocks between the predicted image 503 of global motion compensation and the predicted image 506 of block matching. The motion vectors 504 of the representative points, motion vectors 507 of the blocks and selection data 509 of global motion compensation/block matching are multiplexed by the multiplexing unit 510 and are output as motion data 120.
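The per-block selection performed by the selection switch 508 can be sketched as follows. The text only states that an optimum system is selected for each block; using the sum of absolute differences against the original image as the criterion, preferring global motion compensation on ties, and the name `select_per_block` are assumptions made for this example.

```python
def select_per_block(cur, pred_global, pred_block, n):
    """For each n x n block, keep whichever predicted image is closer to
    the original image `cur`. Returns (selection, final_prediction), where
    selection maps (bx, by) to "global" or "block"."""
    h, w = len(cur), len(cur[0])
    final = [row[:] for row in pred_block]
    selection = {}
    for by in range(0, h, n):
        for bx in range(0, w, n):
            pixels = [(x, y)
                      for y in range(by, min(by + n, h))
                      for x in range(bx, min(bx + n, w))]
            cost_g = sum(abs(cur[y][x] - pred_global[y][x]) for x, y in pixels)
            cost_b = sum(abs(cur[y][x] - pred_block[y][x]) for x, y in pixels)
            if cost_g <= cost_b:
                selection[(bx, by)] = "global"
                for x, y in pixels:
                    final[y][x] = pred_global[y][x]
            else:
                selection[(bx, by)] = "block"
    return selection, final
```

The `selection` map corresponds to the selection data 509: the receiving side needs it to know, for each block, which of the two predicted images to use.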
By introducing the above-mentioned global motion compensation, it becomes possible to express the general motion of the image using few parameters and to accomplish a high data compression ratio. However, the amounts of coding processing and decoding processing become larger than those of the conventional systems. In particular, the division in Equations 3 and 4 is a major factor of complexity in the processing.