Video signals for transmission typically originate from video cameras. The bandwidth of these non-compressed video signals is substantial and, consequently, numerous attempts have been made to reduce the bandwidth of the signals for transmission without unduly degrading the images. Typically, to reduce bandwidth, the frames of video signals are encoded, and redundancies in consecutive frames of the encoded signals are extracted and deleted. Only the differences between consecutive frames are then transmitted. Various techniques are used in the art depending on the particular application. One of the techniques for further reducing the bandwidth of moving images is generally referred to as motion compensated predictive coding.
FIG. 1 illustrates one type of conventional video coder which utilizes motion compensation. Video signals in digital form are received at input 10. It will be assumed for FIG. 1 that the video data applied to input 10 is in a well known block format wherein blocks of 8×8 picture elements (pels) in an image are sequentially applied to input 10. The pels in a block are applied to input 10 in a raster scan type sequence.
A subtractor 12 outputs the difference between the current video signal and a predicted video signal from a predictor 14. Predictor 14 includes a first frame buffer containing the full reconstructed previous frame of video data. This frame buffer is relatively large and expensive since it must typically store at least 352×288 pels. If each pel requires 8 bits to encode, then the frame buffer must store about 811K bits. A 1 Mbit buffer is typically used.
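The buffer size quoted above follows directly from the frame dimensions and the bit depth. The short sketch below merely reproduces that arithmetic; the function and constant names are illustrative and do not appear in FIG. 1.

```python
# Illustrative arithmetic only. The dimensions and bit depth are the ones
# given in the text; the names below are hypothetical.
FRAME_WIDTH_PELS = 352
FRAME_HEIGHT_PELS = 288
BITS_PER_PEL = 8


def frame_buffer_bits(width, height, bits_per_pel):
    """Return the number of bits needed to hold one full frame of pels."""
    return width * height * bits_per_pel


bits = frame_buffer_bits(FRAME_WIDTH_PELS, FRAME_HEIGHT_PELS, BITS_PER_PEL)
print(bits)  # 811008, i.e. about 811K bits
```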
Thus, the difference between the current video frame applied to input 10 and the predicted video frame outputted from predictor 14 is outputted from subtractor 12. The difference signals for an entire block are then transformed by a processor/quantizer 18 to generate transform coefficients using, for example, discrete-cosine transformation, and the coefficients are then quantized. The quantized coefficients are then encoded by a coder 20 to be in a conventional variable-length type code for additional compression of the video signal. The resulting difference signal is then outputted at output 21 for transmission to a receiver.
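The transform and quantization steps performed by processor/quantizer 18 may be sketched as follows. This is a minimal illustration, assuming an orthonormal discrete-cosine transform and a single uniform quantizer step size; the text does not specify either, and the function names and the step value are hypothetical. The variable-length coding of coder 20 is omitted.

```python
import math

N = 8  # block size from the text (8x8 pels)


def difference_block(current, predicted):
    """Pel-by-pel difference, as produced by the subtractor."""
    return [[current[r][c] - predicted[r][c] for c in range(N)] for r in range(N)]


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coefficients, step):
    """Uniform quantizer (an assumption): divide by the step size and round."""
    return [[int(round(value / step)) for value in row] for row in coefficients]


# Example: a block that differs from its prediction by a constant offset.
current = [[14] * N for _ in range(N)]
predicted = [[10] * N for _ in range(N)]
levels = quantize(dct_2d(difference_block(current, predicted)), step=8)
```

For a difference block that is constant across all pels, only the first (DC) coefficient is non-zero after quantization, which illustrates why the transformed difference signal can be coded compactly.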
Motion estimator 16 compares the current block of video data received at input 10 with the data in a search window of the previous frame (in predictor 14) to identify that region or block of the reconstructed previous frame which the current block most closely resembles. The search window (e.g., 16×16 pels) takes into account the anticipated worst-case motion of a block from one frame to another. The search window is centered at the same location in the previous frame as the current block location in the current frame. Within this search window, the pels of the current block are successively compared to the pels of each candidate block position to find a matching block. The shift of the current block which gives the best match to the corresponding block in the previous frame is deemed to be the best estimate of the displacement of the block between frames. This best estimate is usually determined based on a mean squared error (MSE) or mean absolute difference (MAD) criterion. The amount of this best estimate shift, called the motion vector, is then transmitted to the receiver/decoder.
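The block-matching search described above may be sketched as an exhaustive search using the MAD criterion. This is an illustration only: the function names, the (row, column) sign convention of the returned motion vector, and the maximum shift of 4 pels (an 8×8 block inside a 16×16 window) are assumptions rather than details taken from FIG. 1.

```python
def mad(current_block, prev_frame, top, left):
    """Mean absolute difference between the current block and the
    same-sized region of the previous frame whose corner is (top, left)."""
    n = len(current_block)
    total = 0
    for r in range(n):
        for c in range(n):
            total += abs(current_block[r][c] - prev_frame[top + r][left + c])
    return total / (n * n)


def find_motion_vector(current_block, prev_frame, block_top, block_left, max_shift=4):
    """Exhaustive search of every shift within +/- max_shift pels.

    Returns ((dy, dx), error), where (dy, dx) is the shift from the block's
    own location to the best-matching region of the previous frame.
    The zero shift is evaluated first, so ties favor no motion.
    """
    n = len(current_block)
    height, width = len(prev_frame), len(prev_frame[0])
    best_shift = (0, 0)
    best_error = mad(current_block, prev_frame, block_top, block_left)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            top, left = block_top + dy, block_left + dx
            # Skip candidate positions that fall outside the frame.
            if top < 0 or left < 0 or top + n > height or left + n > width:
                continue
            error = mad(current_block, prev_frame, top, left)
            if error < best_error:
                best_error, best_shift = error, (dy, dx)
    return best_shift, best_error
```

An exhaustive search of this kind evaluates up to 81 candidate positions of 64 pels each per block, which gives a sense of the computation involved in finding the best estimate.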
This motion vector is also applied to the address generator controlling the addressing of the first frame buffer in predictor 14 so that the block of pels outputted by predictor 14 corresponds to the displaced block of pels applied to input 10. Thus, the pels outputted by predictor 14 are motion compensated before being compared with the incoming block of pels, thereby making the predictor 14 output a better prediction of the current frame. This results in the difference outputted by subtractor 12 being, on average, smaller, and permits the coder 20 to encode the picture using a lower bit rate than would otherwise be the case.
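The effect of motion compensation on the subtractor output may be illustrated as follows. The sketch assumes a motion vector expressed as a (row, column) shift into the previous frame; the helper names are hypothetical and the address generator is modeled simply as list slicing.

```python
def fetch_block(frame, top, left, n=8):
    """Read an n x n block from a frame buffer at the given address."""
    return [row[left:left + n] for row in frame[top:top + n]]


def residual(current_block, predicted_block):
    """Difference signal: current pels minus predicted pels."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current_block, predicted_block)]


def sum_abs(block):
    """Total magnitude of a difference block."""
    return sum(abs(value) for row in block for value in row)


def motion_compensated_residual(current_block, prev_frame, block_top, block_left,
                                motion_vector):
    """Offset the read address by the motion vector before subtracting."""
    dy, dx = motion_vector
    predicted = fetch_block(prev_frame, block_top + dy, block_left + dx,
                            len(current_block))
    return residual(current_block, predicted)
```

When the motion vector is accurate, the magnitude returned by `sum_abs` for the compensated residual is smaller than for the residual computed with a zero vector, which corresponds to the lower bit rate noted above.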
Motion estimator 16 may, instead of performing block-matching motion estimation, use a well-known pel recursive technique, which generates motion vectors to minimize the prediction error at each pel. This is more computationally intensive than the block-matching technique.
After the motion vector and coded difference signal are transmitted, the receiver then updates the previous frame (already stored at the receiver) using the transmitted motion vector and difference signal. The frame buffer at the receiver must store the full frame, causing this frame buffer to be large and expensive.
The difference signal outputted by processor/quantizer 18 is also fed back to a second frame buffer in either predictor 14 or motion estimator 16 through an inverse processor/quantizer 22 and adder 24. The output of adder 24 is the motion compensated predicted frame in the first frame buffer plus the difference signal. Thus, the second frame buffer in predictor 14 or motion estimator 16 now stores essentially the current frame of the video image (identical to that stored at the receiver), while the first frame buffer still stores the previous frame. This second frame buffer, like the first frame buffer, stores a full frame of pels and is consequently large and expensive.
For the next frame applied to input 10, the functions of the second frame buffer (now storing the previous frame) and the first frame buffer are reversed, so that the second frame buffer outputs blocks of pels to subtractor 12. The above-described block-matching process is then repeated for the next frame.
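The feedback path and the alternating roles of the two frame buffers may be sketched as follows. This is an illustration under stated assumptions: the class and method names are hypothetical, and the difference passed in is assumed to have already been inverse quantized and inverse transformed by inverse processor/quantizer 22.

```python
class FrameBufferPair:
    """Two full-frame buffers whose roles are exchanged after every frame."""

    def __init__(self, height, width):
        self.previous = [[0] * width for _ in range(height)]  # read for prediction
        self.current = [[0] * width for _ in range(height)]   # written by the adder

    def store_reconstructed_block(self, top, left, predicted_block, difference_block):
        """Adder: prediction plus decoded difference, written at (top, left)."""
        for r, (pred_row, diff_row) in enumerate(zip(predicted_block, difference_block)):
            for c, (p, d) in enumerate(zip(pred_row, diff_row)):
                self.current[top + r][left + c] = p + d

    def swap(self):
        """At the start of the next frame the buffers exchange roles."""
        self.previous, self.current = self.current, self.previous


buffers = FrameBufferPair(8, 8)
predicted = [[10] * 8 for _ in range(8)]
difference = [[3] * 8 for _ in range(8)]
buffers.store_reconstructed_block(0, 0, predicted, difference)
buffers.swap()  # the reconstructed frame now serves as the previous frame
```

Both buffers hold a full frame at all times, which is the storage cost the text attributes to this arrangement.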
If a determination is made that the pels of the block have not changed as compared with the previous frame (which is usually the case), a signal is transmitted which signifies to the receiver/decoder that it needs to merely repeat the pels of that block from the previous frame to obtain the pels for the current block.
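The repeat signal described above may be illustrated with the following sketch. The message labels "SKIP" and "CODED" and the function names are hypothetical, and the transform, quantization, and motion compensation steps are omitted so that only the signaling is shown.

```python
def encode_block(current_block, previous_block):
    """Return a repeat signal when the block is unchanged, else the difference."""
    if current_block == previous_block:
        return ("SKIP", None)
    difference = [[c - p for c, p in zip(cur_row, prev_row)]
                  for cur_row, prev_row in zip(current_block, previous_block)]
    return ("CODED", difference)


def decode_block(message, previous_block):
    """Receiver side: repeat the previous pels or add the difference to them."""
    kind, difference = message
    if kind == "SKIP":
        return [row[:] for row in previous_block]
    return [[p + d for p, d in zip(prev_row, diff_row)]
            for prev_row, diff_row in zip(previous_block, difference)]
```

Because the repeat signal carries no pel data, an unchanged block costs only the signal itself.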
The above-described video compression system is relatively large and expensive due to the complex circuitry needed to calculate the best estimate block shift and due to the need for large frame buffers capable of storing an entire frame. What is needed is a video compression system which is less expensive, smaller, and simpler than conventional video compression systems.