Generally, various kinds of moving image processing systems, such as video conference systems and video phones, have been disclosed. In the video phone, an image includes a stationary region such as a background image, and the face and the bust of a human are the main objects to be expressed. Therefore, the correlation between two consecutive frames is very high because only a minor change occurs between the picture frames.
Utilizing this feature of the images processed in the video phone, interframe differential pulse code modulation (hereinafter called DPCM) has been proposed to reduce redundancy in the temporal direction. In order to increase the prediction efficiency of the interframe DPCM, a motion compensated interframe DPCM has been disclosed which can predict a change according to the motion of an object. Furthermore, of the various motion compensated interframe DPCMs, the block matching algorithm (hereinafter called BMA), which detects motion in units of blocks, is widely used.
A composite coding method combining the interframe DPCM with a discrete cosine transform (hereinafter called DCT) method encodes, by the DCT method, the difference between a current image frame block to be encoded and a preceding image frame block. The composite coding method has a configuration as illustrated in FIG. 1.
With reference to FIG. 1, a format division 11 divides an input image of a unit frame into a number of blocks of a given size, thereby formatting the input image. A subtractor 21 receives the image blocks in sequence so as to detect a difference between the preceding and current frame information. A first data compressor 12 first compresses the output data of the subtractor 21 by the DCT method. A second data compressor 13 quantizes the output of the first data compressor 12, thereby further compressing the output of the subtractor 21. A third data compressor 14 performs variable length coding on the quantized data by utilizing a statistical feature of the quantized data.
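The subtract, transform and quantize path described above can be sketched as follows. This is only an illustration under stated assumptions: the 8×8 block size, the orthonormal DCT-II scaling, the uniform quantizer step, and the names `dct_2d` and `encode_block` are hypothetical and are not taken from FIG. 1.

```python
import math


def dct_2d(block):
    """Orthonormal 2-D DCT-II of a square block given as a list of rows."""
    n = len(block)

    def scale(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for m in range(n):
                for k in range(n):
                    acc += (block[m][k]
                            * math.cos((2 * m + 1) * u * math.pi / (2 * n))
                            * math.cos((2 * k + 1) * v * math.pi / (2 * n)))
            out[u][v] = scale(u) * scale(v) * acc
    return out


def encode_block(cur_block, pred_block, step):
    """Subtract the prediction, transform the residual, quantize uniformly."""
    residual = [[c - p for c, p in zip(rc, rp)]
                for rc, rp in zip(cur_block, pred_block)]
    coeffs = dct_2d(residual)
    return [[round(x / step) for x in row] for row in coeffs]


# A flat residual of 4 puts all of its energy into the DC coefficient.
cur = [[104] * 8 for _ in range(8)]
pred = [[100] * 8 for _ in range(8)]
levels = encode_block(cur, pred, step=8)
# levels[0][0] is 4 and every other level is 0
```

A variable length coder would then exploit the long runs of zero levels, which corresponds to the role given to the third data compressor 14.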
In the meantime, a data expander 17 expands the output image signal of the second data compressor 13 by performing an inverse DCT. A frame memory 19 stores the restored preceding frame information and generates motion compensated block data under a given control. A loop filter 18 filters the motion compensated block data.
An adder 22 adds the filtered motion compensated block data from the filter 18 to the expanded image signal from the data expander 17, so as to restore the preceding frame and store it in the frame memory 19. In this case, the frame memory 19 receives position information S2 from a motion detector 20. The position information S2 is a motion vector indicating a relative position of a block in the preceding frame which is similar to a block in the current frame, as shown in FIG. 3.
A multiplexer 15 transmits the quantization information S1 from the second data compressor 13 and the position information S2 from the motion detector 20 according to a format agreed between the transmitting and receiving ends. A buffer 16 transmits the output of the multiplexer 15 to the receiving end and supplies the second data compressor 13 with a control signal CC for regulating the degree of data compression so that it suits the input/output speed of the buffer.
The receiving end, which stores the preceding frame information, receives a difference signal and a motion vector and restores a current block by applying them to the corresponding moving portion of the preceding frame information. Accordingly, a continuous moving image may be expressed.
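The restoration at the receiving end can be sketched as follows. This is a minimal illustration, assuming frames are stored as lists of rows and that the motion vector is a (row, column) offset into the preceding frame; the function name `reconstruct_block` and the sample data are hypothetical.

```python
def reconstruct_block(prev_frame, top, left, motion_vector, diff_block):
    """Rebuild one block from the stored preceding frame, a motion vector
    and a decoded difference block."""
    di, dj = motion_vector  # offset of the matching block in the preceding frame
    restored = []
    for m, diff_row in enumerate(diff_block):
        row = []
        for n, d in enumerate(diff_row):
            predicted = prev_frame[top + di + m][left + dj + n]
            row.append(predicted + d)  # prediction plus transmitted difference
        restored.append(row)
    return restored


prev_frame = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
    [40, 41, 42, 43],
]
diff = [[1, 0], [0, -1]]
block = reconstruct_block(prev_frame, top=1, left=1,
                          motion_vector=(1, -1), diff_block=diff)
# block == [[31, 31], [40, 40]]
```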
Based on the composite coding method, the BMA method and a distortion measure will be described hereinafter. The BMA is a procedure for detecting the block of a preceding image frame that is most similar to a block to be coded in the current image frame. That is to say, a block of the current image frame to be coded is compared with the block at each search position in a predicted search area of the preceding image frame, thereby detecting the most similar block.
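The comparison at every search position can be sketched as an exhaustive search. This is an illustration only: the sum of squared differences is used as the distortion, and the names `get_block`, `squared_error` and `full_search`, the frame layout and the search range are assumptions.

```python
def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def squared_error(a, b):
    """Sum of squared pixel differences between two equally sized blocks."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def full_search(cur_frame, prev_frame, top, left, size, search_range):
    """Return the displacement (di, dj) with the lowest distortion, and that distortion."""
    height, width = len(prev_frame), len(prev_frame[0])
    current = get_block(cur_frame, top, left, size)
    best_vector, best_cost = None, None
    for di in range(-search_range, search_range + 1):
        for dj in range(-search_range, search_range + 1):
            r, c = top + di, left + dj
            if r < 0 or c < 0 or r + size > height or c + size > width:
                continue  # candidate block would leave the preceding frame
            cost = squared_error(current, get_block(prev_frame, r, c, size))
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (di, dj), cost
    return best_vector, best_cost


# Synthetic frames: the current frame is the preceding frame displaced by (1, -2).
prev = [[10 * r + c for c in range(8)] for r in range(8)]
cur = [[prev[r + 1][c - 2] if r + 1 < 8 and c - 2 >= 0 else 0
        for c in range(8)] for r in range(8)]
vector, cost = full_search(cur, prev, top=3, left=3, size=2, search_range=2)
# vector == (1, -2), cost == 0
```

With a search range of ±R this visits (2R+1)² positions per block, which is the cost that coarse-to-fine schemes aim to reduce.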
In the composite encoder, the function of the BMA module is to detect a correct motion vector. A three-step search is therefore widely used, since it is easy to implement in hardware and reduces the computational complexity for a real time system.
As illustrated in FIG. 4, the three-step search consists of three steps and is a kind of coarse-to-fine search: it searches roughly in the first step and more accurately in the second and third steps, on the assumption that the distortion measure between blocks varies smoothly over the search area.
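A coarse-to-fine search of this kind can be sketched as follows. The step sizes of 4, 2 and 1 pixels, the sum of squared differences used as distortion, and the names `block_cost` and `three_step_search` are assumptions, since the text does not state them. The synthetic frames are chosen so that the distortion varies smoothly; when it does not, a search of this kind may settle on a local minimum.

```python
def block_cost(cur_frame, prev_frame, top, left, size, di, dj):
    """Sum of squared differences for displacement (di, dj).
    Returns None when the candidate block leaves the preceding frame."""
    r, c = top + di, left + dj
    if r < 0 or c < 0 or r + size > len(prev_frame) or c + size > len(prev_frame[0]):
        return None
    return sum((cur_frame[top + m][left + n] - prev_frame[r + m][c + n]) ** 2
               for m in range(size) for n in range(size))


def three_step_search(cur_frame, prev_frame, top, left, size, first_step=4):
    """Return (best displacement, its distortion, number of positions visited)."""
    best = (0, 0)
    best_cost = block_cost(cur_frame, prev_frame, top, left, size, 0, 0)
    visited = 1  # the centre of the first step
    step = first_step
    while step >= 1:
        center = best
        for a in (-step, 0, step):
            for b in (-step, 0, step):
                if (a, b) == (0, 0):
                    continue  # centre was already evaluated
                cand = (center[0] + a, center[1] + b)
                cost = block_cost(cur_frame, prev_frame, top, left, size, *cand)
                visited += 1
                if cost is not None and cost < best_cost:
                    best, best_cost = cand, cost
        step //= 2  # refine around the best position found so far
    return best, best_cost, visited


# Synthetic frames: the current frame is the preceding frame displaced by (7, -5).
prev = [[100 * r - c for c in range(32)] for r in range(32)]
cur = [[100 * (r + 7) - (c - 5) for c in range(32)] for r in range(32)]
vector, cost, visited = three_step_search(cur, prev, top=12, left=12, size=4)
# vector == (7, -5), cost == 0, visited == 25
```

The count of 25 visited positions (9 in the first step, 8 in each of the following two) compares with 225 positions for an exhaustive search over the same ±7 range.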
For computing the distortion measure, four functions are available: NCCF (Normalized Cross-Correlation Function), MSE (Mean Square Error), MAE (Mean of the Absolute Error) and MNAE (Mean Number of bits necessary to binary code the Absolute Error). Among the four functions, the MSE is widely used because of its simple computation method.

$$\mathrm{MSE}(i,j) = \frac{1}{MN}\sum_{m=1}^{M}\sum_{n=1}^{N}\left[I(m,n,t) - I(m+i,\,n+j,\,t-\tau)\right]^{2} \qquad (1)$$

where M and N indicate the size of the blocks, I(m,n,t) indicates the brightness intensity of the (m,n)th image element of the current block to be encoded, and I(m+i, n+j, t-τ) indicates the brightness intensity of the (m,n)th image element of the preceding image block located at a distance of (i,j) from the current block.
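As a small worked illustration of two of the measures named above, the MSE and the MAE of a pair of blocks can be computed as follows. The function names and the 2×2 sample blocks are hypothetical.

```python
def mse(cur_block, prev_block):
    """Mean square error between two equally sized blocks."""
    count = len(cur_block) * len(cur_block[0])
    total = sum((x - y) ** 2
                for rc, rp in zip(cur_block, prev_block)
                for x, y in zip(rc, rp))
    return total / count


def mae(cur_block, prev_block):
    """Mean of the absolute error between two equally sized blocks."""
    count = len(cur_block) * len(cur_block[0])
    total = sum(abs(x - y)
                for rc, rp in zip(cur_block, prev_block)
                for x, y in zip(rc, rp))
    return total / count


cur_block = [[10, 12], [14, 16]]
prev_block = [[9, 12], [16, 13]]
# Pixel differences are 1, 0, -2 and 3.
mse_value = mse(cur_block, prev_block)  # (1 + 0 + 4 + 9) / 4 = 3.5
mae_value = mae(cur_block, prev_block)  # (1 + 0 + 2 + 3) / 4 = 1.5
```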
Computing the distortion measure with a digital signal processor (hereinafter called DSP) takes a predetermined time, as described below.
To compute equation (1), the difference between the (m,n)th pixels of the corresponding M×N blocks is first stored, and the stored difference is then squared and accumulated by a SQRA instruction. The SQRA instruction is a general instruction for squaring a value and accumulating the result, wherein the squaring and accumulating are done in one instruction cycle. The number of instruction cycles required may be expressed as follows.

$$M \times N \times 25 \times 4 = 100MN\ \text{(cycles)} \qquad (2)$$
where the number 25 is the quantity of pixels to be searched as shown in FIG. 4 (in this case, 9 elements indicated as "○" in the first step and 8 elements indicated as each of "x" and " " in the second and third steps), and the number 4 is the number of cycles required for one operation.
The result of equation (2) means that 2 cycles per pixel are required for storing the difference and one cycle for performing the SQRA instruction. For example, in computing the distortion measure of a 16×16 block by using the TMS 320C25 DSP chip from Texas Instruments, the processing time can be expressed as:

$$16 \times 16 \times 25 \times 4 \times 100\ \text{(nsec/cycle)} = 2{,}560\ \mu\text{sec} \qquad (3)$$
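The arithmetic of equations (2) and (3) can be checked with a few lines of code. The function names are hypothetical; the figures of 25 search positions, 4 cycles per operation and 100 nsec per cycle are the ones given in the text.

```python
def mse_search_cycles(block_rows, block_cols, search_points=25, cycles_per_pixel=4):
    """Instruction cycles needed to evaluate the MSE at every search position."""
    return block_rows * block_cols * search_points * cycles_per_pixel


def search_time_usec(cycles, nsec_per_cycle=100):
    """Convert a cycle count into microseconds."""
    return cycles * nsec_per_cycle / 1000


cycles = mse_search_cycles(16, 16)    # 100 * 16 * 16 = 25600 cycles
time_usec = search_time_usec(cycles)  # 2560.0 microseconds
```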
According to Recommendation H.261 of the International Telegraph and Telephone Consultative Committee (CCITT) for the video phone, it is recommended that, when ten frames per second are processed, an input image have a size of 1/4 CIF with a spatial resolution as shown in Table 1, and that a macro block include a 16×16 luminance component Y and 8×8 color difference signals R-Y and B-Y. Accordingly, the pixel clock and the macro block duration, within which one macro block consisting of 16×24 pixels should be processed, can be expressed as equations (4) and (5).
TABLE 1

Image     Horizontal Resolution (pixels/line)     Vertical Resolution (lines/frame)
Format    Luminance Signal   Color Diff. Signal   Luminance Signal   Color Diff. Signal
1 CIF     352                176                  288                144
1/4 CIF   176                88                   144                72

$$F_p = (176 \times 144 + 2 \times 88 \times 72)\ \text{(pixels/frame)} \times 10\ \text{(frames/sec)} = 380.16\ \text{[Kpixels/sec]} \qquad (4)$$

$$T_B = 384 / F_p = 1010.1\ (\mu\text{sec}) \qquad (5)$$
Accordingly, input data to be processed is received at 380.16 Kpixels/sec, and the time allocated for processing a macro block of 384 pixels is limited to 1010.1 µsec. Since the search of equation (3) takes 2,560 µsec, at least three DSP modules are necessary to implement the video phone in real time, so that the distortion measure information of one macro block can be searched within the time allowed for processing one macro block. Because the motion detection speed decides how many DSP elements are required, it is an important consideration in embodying a motion detector, both from an economical point of view and for simplifying the hardware configuration by reducing the hardware size.
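The pixel rate, the macro block duration and the resulting DSP count can be reproduced as follows. This sketch assumes that both color difference signals (R-Y and B-Y) are counted in the pixel rate, which is what the stated figure of 380.16 Kpixels/sec implies; the variable names are hypothetical, and the 2,560 µsec search time is the value of equation (3).

```python
import math

luma_pixels = 176 * 144                                 # luminance samples per 1/4 CIF frame
chroma_pixels = 88 * 72                                 # samples per color difference signal
pixels_per_frame = luma_pixels + 2 * chroma_pixels      # Y plus R-Y plus B-Y
frames_per_sec = 10

pixel_rate = pixels_per_frame * frames_per_sec          # 380160 pixels/sec
macro_block_pixels = 16 * 16 + 2 * 8 * 8                # 384 pixels, i.e. 16 x 24
t_b_usec = macro_block_pixels / pixel_rate * 1e6        # about 1010.1 microseconds

search_time_usec = 2560                                 # from equation (3)
dsp_needed = math.ceil(search_time_usec / t_b_usec)     # 3
```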
Presently, there is known a real time motion detector using two TMS 320C25 DSP chips, which reduces the processing time by further applying sub-sampling to the three-step BMA method.
The motion detector is disclosed in the master's thesis entitled "Embodiment of a real time motion detector by using a DSP element" submitted to Korea Institute of Science and Technology by Ki-hwan KIM in 1990.
However, the above conventional method is inefficient when the operation mode of the DSP is considered. Because each value is squared and accumulated only after the difference between corresponding pixels of the current and preceding frames has been stored, the operating feature of the DSP element, which performs an addition and accumulation after performing a multiplication, cannot be fully utilized, and the method is furthermore not effective in view of the operation speed.