Video signals for transmission typically originate from video cameras. The bandwidth of these non-compressed video signals is substantial and, consequently, numerous attempts have been made to reduce the bandwidth of the signals for transmission without unduly degrading the images. Typically, to reduce bandwidth, the frames of video signals are encoded, and redundancies in consecutive frames of the encoded signals are extracted and deleted. Only the differences between consecutive frames are then transmitted. Various techniques are used in the art depending on the particular application. One of the techniques for further reducing the bandwidth of moving images is generally referred to as motion compensated predictive coding.
FIG. 1 illustrates one type of conventional video coder which utilizes motion compensation. Video signals in digital form are received at input 10. It will be assumed for FIG. 1 that the video data applied to input 10 is in a well known block format wherein blocks of 8×8 picture elements (pels) in an image are sequentially applied to input 10. The pels in a block are applied to input 10 in a raster scan type sequence.
A subtractor 12 outputs the difference between the current video signal and a predicted video signal from a predictor 14. Predictor 14 includes a first frame buffer containing the reconstructed previous frame of video data. Thus, the difference between the current video frame applied to input 10 and the predicted video frame outputted from predictor 14 is outputted from subtractor 12. The difference signals for an entire block are then transformed by a processor/quantizer 18 to generate transform coefficients using, for example, discrete-cosine transformation, and the coefficients are then quantized. The quantized coefficients are then encoded by a coder 20 to be in a conventional variable-length type code for additional compression of the video signal. The resulting difference signal is then outputted at output 21 for transmission to a receiver.
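The subtract, transform, and quantize path described above can be illustrated with a minimal sketch. It assumes an orthonormal DCT-II and a uniform quantizer with a step size of 16; neither choice is specified in the text, and the function names are hypothetical.

```python
import math

N = 8  # block size from the text (8x8 pels)


def dct_matrix(n=N):
    """Orthonormal DCT-II basis matrix C, so that coefficients = C * X * C^T."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    return [list(row) for row in zip(*m)]


def residual(current, predicted):
    """Role of subtractor 12: current block minus predicted block."""
    return [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(current, predicted)]


def dct2(block):
    """Two-dimensional DCT of a square block."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def quantize(coeffs, step=16):
    """Uniform quantizer (assumed): round each coefficient to a multiple of step."""
    return [[int(round(v / step)) for v in row] for row in coeffs]


current = [[108] * N for _ in range(N)]
predicted = [[100] * N for _ in range(N)]
levels = quantize(dct2(residual(current, predicted)))
```

For a flat residual such as this one, all of the energy falls into the single DC coefficient, which is why the transform tends to help the variable-length coder that follows.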
Motion estimator 16 compares the current block of video data received at input 10 with the data in a search window of the previous frame (in predictor 14) to identify that region or block of the reconstructed previous frame which the current block most closely resembles. The search window (e.g., 16×16 pels) takes into account the anticipated worst-case motion of a block from one frame to another. The search window is centered at the same location in the previous frame as the current block location in the current frame. Within this search window, the pels of the current block are successively compared with the pels of each candidate block position to find a matching block. The shift of the current block which gives the best match to the corresponding block in the previous frame is deemed to be the best estimate of the displacement of the block between frames. This best estimate is usually determined based on a mean squared error (MSE) or mean absolute difference (MAD) criterion. The amount of this best estimate shift, called the motion vector, is then transmitted to the receiver/decoder.
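A minimal sketch of exhaustive block matching under the MAD criterion follows. It assumes that a 16×16 search window around an 8×8 block corresponds to shifts of up to 4 pels in each direction, and that the motion vector is expressed as the (row, column) offset of the best-matching block in the previous frame; the function names are hypothetical.

```python
def mad(cur_block, prev_frame, top, left):
    """Mean absolute difference between cur_block and the same-size region
    of prev_frame whose top-left corner is (top, left)."""
    n = len(cur_block)
    total = 0
    for r in range(n):
        for c in range(n):
            total += abs(cur_block[r][c] - prev_frame[top + r][left + c])
    return total / (n * n)


def full_search(cur_block, prev_frame, top, left, max_shift=4):
    """Try every shift inside the search window and keep the lowest MAD.

    (top, left) is the position of the current block in the current frame.
    Returns ((dy, dx), cost). Ties are resolved in favor of zero motion
    first, then in scan order.
    """
    n = len(cur_block)
    height, width = len(prev_frame), len(prev_frame[0])
    best_mv = (0, 0)
    best_cost = mad(cur_block, prev_frame, top, left)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall partly outside the previous frame.
            if y < 0 or x < 0 or y + n > height or x + n > width:
                continue
            cost = mad(cur_block, prev_frame, y, x)
            if cost < best_cost:
                best_mv, best_cost = (dy, dx), cost
    return best_mv, best_cost


# Synthetic previous frame; the current block is a copy of the region
# located 2 rows down and 3 columns left of position (8, 8).
prev_frame = [[r * 24 + c for c in range(24)] for r in range(24)]
cur_block = [row[5:13] for row in prev_frame[10:18]]
motion_vector, cost = full_search(cur_block, prev_frame, top=8, left=8)
```

The search evaluates 81 candidate positions of 64 pels each for every block, which gives a sense of the computational load placed on the motion estimator.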
This motion vector is also applied to the address generator controlling the addressing of the first frame buffer in predictor 14 so that the block of pels outputted by predictor 14 corresponds to the displaced block of pels applied to input 10. Thus, the pels outputted by predictor 14 are motion compensated before being compared with the incoming block of pels, thereby making the predictor 14 output a better prediction of the current frame. This results in the difference outputted by subtractor 12 being, on average, smaller, and permits the coder 20 to encode the picture using a lower bit rate than would otherwise be the case.
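The effect of motion compensation on the subtractor output can be sketched as follows. The helper names are hypothetical, the motion vector is assumed to be a (row, column) offset into the previous frame, and the frame contents are synthetic.

```python
def predict_block(prev_frame, top, left, mv, n=8):
    """Fetch the n x n block of the previous frame displaced by mv = (dy, dx)
    from the current block position (top, left)."""
    dy, dx = mv
    return [row[left + dx:left + dx + n]
            for row in prev_frame[top + dy:top + dy + n]]


def abs_residual_sum(cur_block, pred_block):
    """Sum of absolute differences between the current and predicted blocks."""
    return sum(abs(c - p)
               for cur_row, pred_row in zip(cur_block, pred_block)
               for c, p in zip(cur_row, pred_row))


prev_frame = [[r * 24 + c for c in range(24)] for r in range(24)]
# The current block at (8, 8) shows content that sat at (10, 5) previously.
cur_block = [row[5:13] for row in prev_frame[10:18]]

uncompensated = abs_residual_sum(cur_block, predict_block(prev_frame, 8, 8, (0, 0)))
compensated = abs_residual_sum(cur_block, predict_block(prev_frame, 8, 8, (2, -3)))
```

In this contrived case the compensated residual vanishes entirely; with real video the residual is typically smaller rather than zero.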
Motion estimator 16 may, instead of performing block-matching motion estimation, use a well-known pel recursive technique, which generates motion vectors to minimize the prediction error at each pel. This is more computationally intensive than the block-matching technique.
After the motion vector and coded difference signal are transmitted, the receiver then updates the previous frame (already stored at the receiver) using the transmitted motion vector and difference signal. Two frame buffers at the receiver may be needed to avoid altering search windows for neighboring blocks in the previous frame. One frame buffer would contain the previous frame, and the other frame buffer would store the motion compensated data.
The difference signal outputted by processor/quantizer 18 is also fed back to a second frame buffer in either predictor 14 or motion estimator 16 through an inverse processor/quantizer 22 and adder 24. The output of adder 24 is the motion compensated predicted frame in the first frame buffer plus the difference signal. Thus, the second frame buffer in predictor 14 or motion estimator 16 now stores essentially the current frame of the video image (identical to that stored at the receiver), while the first frame buffer still stores the previous frame.
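The feedback path can be sketched as below. For brevity the sketch omits the transform and quantizes the difference signal directly, assumes a uniform quantizer with a step size of 8, and clips to an assumed 8-bit pel range; the function names are hypothetical. The point illustrated is that the coder reconstructs the frame from the quantized data only, which is the same information available to the receiver.

```python
def quantize(diff_block, step):
    """Forward quantizer (assumed uniform): difference values to integer levels."""
    return [[int(round(v / step)) for v in row] for row in diff_block]


def reconstruct(pred_block, levels, step):
    """Roles of inverse quantizer 22 and adder 24: prediction plus the
    dequantized difference, clipped to the assumed 8-bit pel range."""
    return [[min(255, max(0, p + q * step)) for p, q in zip(pr, qr)]
            for pr, qr in zip(pred_block, levels)]


step = 8
current = [[107, 93], [100, 121]]
predicted = [[100, 100], [100, 100]]
diff = [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(current, predicted)]

levels = quantize(diff, step)                      # what is transmitted
stored = reconstruct(predicted, levels, step)      # what both ends store
```

Because the stored frame differs from the true current frame by the quantization error, predicting from the stored frame rather than from the original keeps the coder and the receiver from drifting apart.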
For the next frame applied to input 10, the functions of the second frame buffer (now storing the previous frame) and the first frame buffer are reversed, so that the second frame buffer outputs blocks of pels to subtractor 12. The above-described block-matching process is then repeated for the next frame.
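The role reversal of the two frame buffers can be sketched as a simple swap of references. The class and method names are hypothetical, and the buffers are modeled as lists of rows.

```python
class FrameBuffers:
    """Two frame buffers whose roles are exchanged after each frame."""

    def __init__(self, height, width):
        # Buffer read by the predictor while the current frame is coded.
        self.reference = [[0] * width for _ in range(height)]
        # Buffer that accumulates the reconstructed current frame.
        self.working = [[0] * width for _ in range(height)]

    def write_block(self, top, left, block):
        """Store a reconstructed block without disturbing the reference frame."""
        for r, row in enumerate(block):
            self.working[top + r][left:left + len(row)] = row

    def swap(self):
        """At a frame boundary, the reconstructed frame becomes the reference."""
        self.reference, self.working = self.working, self.reference


buffers = FrameBuffers(4, 4)
buffers.write_block(1, 1, [[5, 6], [7, 8]])
unchanged_during_frame = buffers.reference[1][1]
buffers.swap()
```

Writing into a separate buffer keeps the search windows of neighboring blocks intact until the whole frame has been processed.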
If a determination is made that the pels of the block have not changed as compared with the previous frame (which is usually the case), a signal is transmitted which signifies to the receiver/decoder that it need only repeat the pels of that block from the previous frame to obtain the pels for the current block.
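The unchanged-block signaling can be sketched as follows. The dictionary-based symbol format, the threshold parameter, and the function names are hypothetical and serve only to illustrate the decision; the difference signal is left unquantized for simplicity.

```python
def encode_block(cur_block, prev_block, threshold=0):
    """Emit a skip symbol when the block matches the co-located block of the
    previous frame, otherwise emit the difference signal."""
    diff = [[c - p for c, p in zip(cr, pr)]
            for cr, pr in zip(cur_block, prev_block)]
    if sum(abs(v) for row in diff for v in row) <= threshold:
        return {"skip": True}
    return {"skip": False, "diff": diff}


def decode_block(symbol, prev_block):
    """Repeat the previous block on a skip symbol, otherwise add the difference."""
    if symbol["skip"]:
        return [list(row) for row in prev_block]
    return [[p + d for p, d in zip(pr, dr)]
            for pr, dr in zip(prev_block, symbol["diff"])]


prev_block = [[1, 2], [3, 4]]
static_symbol = encode_block([[1, 2], [3, 4]], prev_block)
moving_symbol = encode_block([[1, 5], [3, 4]], prev_block)
```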
The primary goal of a video coder is to minimize the resultant transmitted bit rate of a video signal. Thus, a criterion for determining whether the motion estimator 16 has identified the best estimate for the movement of the block being analyzed is the entropy of the displaced block-difference, given a certain constraint on the block-difference quantization distortion. In other words, the better the motion estimate by motion estimator 16, the smaller the number of bits resulting from encoding the block difference, given a certain constraint on the quantization distortion generated.
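The entropy criterion can be approximated with a first-order empirical entropy of the quantized block difference, as in the sketch below. This is only a proxy for the number of bits a practical variable-length coder would produce, and the function name is hypothetical.

```python
import math
from collections import Counter


def empirical_entropy_bits(values):
    """First-order empirical entropy, in bits per symbol, of a sequence."""
    counts = Counter(values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# A good motion estimate leaves a difference signal dominated by one value,
# while a poor estimate leaves the values spread out.
good_estimate_diff = [0, 0, 0, 0]
poor_estimate_diff = [0, 1, 2, 3]

bits_good = empirical_entropy_bits(good_estimate_diff)
bits_poor = empirical_entropy_bits(poor_estimate_diff)
```

Comparing candidate motion vectors by this measure, at a fixed quantizer step size, would rank them by approximate coding cost rather than by MSE or MAD.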
The above "best estimate" techniques performed by conventional motion estimators are relatively complex and fairly expensive to implement as an integrated circuit. Because of this complexity, the motion estimator frequently must be formed as a separate integrated circuit, and it then accounts for about one-half of the video coder. What is needed is a motion estimation method and structure which uses as much conventional circuitry (and algorithms) as possible but which results in a less expensive, smaller, and simpler video coder/decoder (codec) having motion estimation.